Page tree
Skip to end of metadata
Go to start of metadata

This page covers some of the lessons learned when writing solutions in C++ for programming contest problems. Many of these problems require a substantial number of integer values to be read in and this can contribute significantly to the over execution time. This page covers some of the methods that can be used to reduce this time penalty.

Testing Method

To evaluate different methods of reading integer data, a large text input file of 100 million integer numbers was used. These numbers are read using the function below to compute a final sum (simply to keep the code from being optimized away):

Sum Function
void compute_sum()
{
	int total_numbers = get_next_int();
	long long sum = 0;
	for (int i = 0; i < total_numbers; i++)
		sum += get_next_int();
	printf("%lld\n", sum);
}

1) C++ STD::CIN

The most standard approach to reading numbers is to use the C++ standard library. This is simple and easy to use:

STD::CIN
int get_next_int()
{
	int number;
	std::cin >> number;
	return number;
}

 

2) C++ STD::CIN (Untied and Unsync streams)

Using the approach above, you can increase performance if you:

  1. Disable the synchronization between C and C++ standard streams
  2. Untie the cin and cout streams

For contests, this is safe to do as long as you don't mix C and C++ stream IO:

STD::CIN (Untied and Unsync)
void init_io()
{
	std::cin.sync_with_stdio(false);
	std::cin.tie(NULL);
}

int get_next_int()
{
	int number;
	std::cin >> number;
	return number;
}

 

3) C Scanf

This is the old C standard way of reading in data:

Simple STD::CIN Solution
int get_next_int()
{
	int number;
	scanf("%d", &number);
	return number;
}

 

4) Dedicated Integer Reader

This function is only useful when all the input is integer numbers (it cannot be used with any other input methods):

Dedicated Integer Reader
const int max_buffer_size = 1048576; // 1 Megabyte
char buffer[max_buffer_size];
char* maxChar = buffer;
char* curChar = buffer;

inline char get_next_char()
{
	if (curChar == maxChar)
	{
		// Fill Buffer
		int read = fread(buffer, 1, max_buffer_size, stdin);
		maxChar = buffer + read;
		curChar = buffer;
	}
	return *(curChar++);
}

int get_next_int()
{	
	// Skip to next int
	char c;
	do
		c = get_next_char();
	while ((c < '0' || c > '9') && (c != '-'));

	// Deal with Negative Numbers
	int sign = 1;
	if (c == '-')
	{
		sign = -1;
		c = get_next_char();
	}

	// Compute the actual number
	int number = 0;
	do
	{
		number = (number * 10) + (c - '0');
		c = get_next_char();
	} while (c >= '0' && c <= '9');

	// Final Number
	return number * sign;
}

 

5) Raw IO

The most basic comparison is the raw reading from a file. This reads the entire input file into memory one block at a time without the conversion to numbers.

Raw IO
#include <stdio.h>
const int max_buffer_size = 1048576; // 1 Megabyte
char buffer[max_buffer_size];

void raw_io()
{
	int total_to_read = 582398894;
	while (total_to_read > 0)
	{
		// Fill Buffer
		int read = fread(buffer, 1, max_buffer_size, stdin);
		total_to_read -= read;
	}
}

 

Results

For all timing results, the test was run four times with the slowest result discarded and the remaining the results averaged.

The testing system was a Windows 10 64-bit machine with an intel i5-2500K (overclocked to 4.3Ghz) and 16GB of RAM.

#MethodExecution TimeNumbers Read per Second
1C++ STD::CIN61.14 seconds1,635,455
2C++ STD::CIN (Untied and Unsync streams)51.45 seconds1,943,322
3C Scanf24.76 seconds4,037,397
4Dedicated Integer Reader1.89 seconds52,678,897
5Raw IO0.64 seconds-

Conclusions

On a contest with a 3 second budget for solving the problem you have to be very careful what time you spend doing IO. My recommendation is that the IO take no more than 5% of your overall budget which means that it has to be completed in less than 150 milliseconds. Here are some recent contests and how they would fare on their largest test case:

ContestMax Integers1) CIN2) CIN3) Scanf4) Reader
Through A Maze Darkly600,003367 ms316 ms145 ms11 ms
Stupendous Bowties200,20284 ms73 ms33 ms5 ms

Overall conclusions:

  1. Always use Scanf since it is simple to implement and will keep the IO costs down
  2. In case of large datasets or if your solution is close to the edge - use the reader.

 

  • No labels