CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – V

In the previous articles of this series about an Ajax controlled file upload with PHP progress tracking

CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – IV
CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – III
CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – II
CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – I

we have studied elements of an Ajax controlled setup to trigger the upload itself and additional independent jobs for progress tracking.

On the Javascript side we first became acquainted with Control objects [CtrlOs] to fully control HTML elements and the Ajax communication of different forms and objects with the PHP server. We then learned that our main PHP job [handle_uploaded_init_files.php5] triggered from the file upload form is of no use for progress tracking.

In the last article we therefore discussed some Javascript methods to start a sequence of independent Ajax controlled polling jobs. Such jobs can retrieve information about the progress of the file upload from the server periodically and in the background. We learned in addition that we may need some adaption of our polling time interval to the measured data transfer rate. We expect to get the rate information from the server together with some standard progress information as e.g. the sum of bytes uploaded at the moment of polling.

The progress information for the file transfer is continuously updated in the $_SESSION array on the LAMP server according to settings in the php.ini configuration file. In this article we now discuss the rather simple PHP job for fetching the information and sending it back to the client (browser).

The PHP polling job to read progress information and send it to the client

The periodically called PHP job, which we named “check_progress.php5”, basically contains four steps:

  • Access the session and start output buffering,
  • instantiate an object which controls information gathering,
  • fetch information from the $_SESSION array,
  • encode and send the JSON object to the Ajax client

PHP program “check_progress.php5”

// start session and output buffer
session_start();
ob_start(); 

// Instantiate progress checker object 
$PC = new ProgressChecker();  

// retrieve and send information 
$PC->check_progr(); 
$PC->encodeAndSendJson(); 
exit; 

// ------------------------------
class ProgressChecker  
{
	//Some variables 
	...
	function __construct() {
		....
	}
	function check_progr() {
		....
	}
	function encodeAndSendJson() {
		....
	}
}

Why did we need the output buffering ob_start()? By using the buffer we control unwanted output produced e.g. by warnings of the PHP interpreter! We want to guarantee that our Ajax client really receives a JSON object (which it expects!) and nothing else. See the third article of this series for more remarks on this
point.

Remark, 05.07.2015: We have corrected a mistake in the originally published code. “ob_end_clean()” was removed from the main program code. Note that “ob_end_clean()” must be performed just before the Ajax response is sent. This happens in the method encodeAndSendJson(); see below.

Although not required we put all functionality into a class. This will give us more flexibility in case we need to extend the functionality of the object for progress tracking later on.

Some variables of the class “ProgressChecker”

In our class we first set some initial values of elements of an array “$ajax_response[]“. This array will later on be encoded as the JSON response object of our Ajax transaction.

	// The array to build the JSON object from 
	var $ajax_response = array();
	// Some useful variables 
	var $percentage = -1;
	// Error handling 
	var $err = 0;
	// variables to compose the key of progress information in $_SESSION
	var $sess_key_progress 	= '';
	var $key_POST = "progress_key";
	
	function __construct() {
		
		$this->ajax_response['sess_key_progress'] = '';
		
		$this->ajax_response['msg'] = '';
		$this->ajax_response['err'] = 0;
		$this->ajax_response['err_msg'] = '';
		$this->ajax_response['sys_msg'] = '';
		
		$this->ajax_response['finalized'] = 0;
		$this->ajax_response['time'] = '';
		$this->ajax_response['secs'] = 0.0;
		$this->ajax_response['diff_secs'] = 0.0;
		$this->ajax_response['end_time'] = '';
		$this->ajax_response['end_secs'] = 0.0;
	
		$this->ajax_response['bytes'] = -1;
		$this->ajax_response['diff_bytes'] = -1;
		$this->ajax_response['total'] = -1;
		$this->ajax_response['percentage'] = -1;
		$this->ajax_response['rate'] = -1.0;
	
		$this->ajax_response['fs_limit'] = -1;
		$this->ajax_response['max_inp_time'] = -1;
	}

 
We organize the information that may be required by the client: the composed key to access progress information in $_SESSION, some time information and real progress information.

“bytes” represents the number of Bytes received so far on the server, “total” is the total number of bytes of the transferred file (i.e. its size), “percentage” is calculated from “bytes” and “total”. In addition we deliver a measured data transfer “rate” in [Bytes/msec]. The reader may compare this with the result handling on the Javascript client side described in the last article of this series.

Last but not least we initialize some variables which shall deliver useful information about parameters set in the “php.ini” file on the server.

You see that we intentionally set some variables to negative values. Such an initialization can help to analyze potential errors on the server and especially after some response has been returned on the client.

Reading progress information from the $_SESSION array

The core of our class is the method check_progress():

function check_progr() {
	
	// Compose the key for the information in the $_SESSION array  
	if (!isset( $_POST[$this->key_POST] ) ) {
		$this->err++;
		$this->ajax_response['err'] = 1;
		$this->ajax_response['err_msg'] .= '<br>The required key was not delivered in $_POST';
	}
		
	if ( $this->err == 0 ) {
		$this->sess_key_progress .= ini_get("session.upload_progress.prefix"). $_POST[$this->key_POST];
		$this->ajax_
response['sess_key_progress'] = $this->sess_key_progress;
	}
		
	// Check for progr. information 
		// We have set session.upload_progress.cleanup = Off
		// So we do have some information here
			
	if (!isset($_SESSION[$this->sess_key_progress]) || empty($_SESSION[$this->sess_key_progress])) {
		$this->err++;
		$this->ajax_response['err'] = 2;
		$this->ajax_response['err_msg'] .= '<br>Error: The progress information could not be read from $_SESSION';
		return; 
	}
			
	// Save some information at the first poll for later rate calculations 	
	if (!isset($_SESSION['first_secs'])) {
		$_SESSION['first_secs'] = microtime(true);
	}
	if (!isset($_SESSION['first_bytes'])) {
		$_SESSION['first_bytes'] = $_SESSION[$this->sess_key_progress]["bytes_processed"];
	}
		
	// Get some php.ini parameters
	$file_size_limit = ini_get("upload_max_filesize");
	$this->ajax_response['fs_limit'] = $file_size_limit; 
	$this->ajax_response['max_inp_time'] = ini_get("max_input_time"); 

	// investigate whether file size is too big and return error 
	$file_size = $total / 1024 / 1024; 
	if ( $file_size > $file_size_limit ) {
		$this->err++; 
		$this->ajax_response['err'] = 3;
		$this->ajax_response['err_msg'] .= "<br>Error: The file has a size of " . number_format($file_size, 2, '.', '') . " MByte and is bigger than the allowed limit on the server (" . $file_size_limit . " MByte)"; 
		return; 	
	}
	
	// Retrieve progress information 
			
	// Bytes
	$current = $_SESSION[$this->sess_key_progress]["bytes_processed"];
	$total 	 = $_SESSION[$this->sess_key_progress]["content_length"];
	
	$this->ajax_response['percentage'] = ($current < $total) ? ($current / $total * 100) : 100;
	$this->ajax_response['bytes'] = $current;
	$this->ajax_response['total'] = $total;
		
	$this->ajax_response['time'] = date('%d.%m.%Y-%H:%M:%S');
	$this->ajax_response['secs'] = microtime(true);
	$this->ajax_response['first_bytes'] = $_SESSION['first_bytes'];
	$this->ajax_response['first_secs'] = $_SESSION['first_secs'];
	$this->ajax_response['diff_secs'] = $this->ajax_response['secs']  - $_SESSION['first_secs'];
	$this->ajax_response['diff_bytes'] = $this->ajax_response['bytes'] - $_SESSION['first_bytes'];
	
	// Rate in bytes / msec
	if ($this->ajax_response['diff_secs'] > 1.0 ) {
		$this->ajax_response['rate'] = floor($this->ajax_response['diff_bytes'] / ( $this->ajax_response['diff_secs'] * 1000.0));
	}
		
	// upload finalized ? if yes => cleanup of $_SESSION 
	$done_upl 	= $_SESSION[$this->sess_key_progress]["done"];
	$done_file 	= $_SESSION[$this->sess_key_progress]["files"][0]["done"];
		
	if ($done_upl && $done_file) {
		$this->ajax_response['end_time'] = $this->ajax_response['time'];
		$this->ajax_response['end_secs'] = $this->ajax_response['secs'];
	
		// cleanup of progress variables
		// the real session will be closed by the upload job
		unset($_SESSION[$this->sess_key_progress]);
		unset($_SESSION['first_secs']);
		unset($_SESSION['first_bytes']);
	}
		
}

 
As we already know, we first have to compose the key to access progress information in the $_SESSION array. We must use

  • both information provided by the client to uniquely identify the file transfer process to which our polling job refers
  • and some specific information of the “php.ini” file.

See the third article of this series for more information about this point. Remember that we explicitly took care about the fact that key relevant information reaches the server first – i.e. before the file data.

In the block of program statements we obviously test
the existence of progress data – and stop our activities with setting an error flag and an error message in our response array. Such an error situation must be recognized and handled on the client side; see the last article of this series.

Then we add some initial time information and byte information to new elements of the $_SESSION array. Obviously, this happens only at the first polling job. This stored information will later on be used for rate estimations.

Important note:

Our simplified approach to rate estimation via additional $_SESSION variables will only work if you trigger exactly one upload process at a time. If you triggered several upload jobs in parallel (i.e. before previously started jobs have finalized and within one and the same session) you would need to make all progress information which you add to $_SESSION dependent on the identifier of the transfer process.

We leave this extension to the reader and assume that different upload jobs are started from the client only sequentially.

We now fetch 2 parameter values from the “php.ini” file: for the maximum allowed size of upload files and the maximum allowed input parsing time. The latter value determines how much time can be used on the server to deal with the transferred (POST) data. This is a critical limit about which the client should be informed.

In addition: If we find that the size of the transferred file is too big, we return an error to the client.

Next, we retrieve progress information, calculate the reached percentage of transferred bytes and estimate a transfer rate. To avoid any problems with divisions by zero we deliver a rate value only after at least a second (i.e. the default the update time interval on the server for progress information) has passed. Remember that these rate values are used on the client side to adapt the polling period to reasonable values.

Finally, we check by some criteria whether the transfer job has already finished and all file data are received. If this is the case, we erase progress related information from the $_SESSION array. This is absolutely necessary if you want to start further upload jobs during the same PHP session.

Important note:

We are only able to retrieve the information about the end of the file transfer if the progress information is not automatically erased from the $_SESSION array by the tracking engine of the server itself. Otherwise our last polling job would almost always run into problems. So, for our Ajax controlled polling to work flawlessly, we must set:
session.upload_progress.cleanup = Off
in our “php.ini”-file – and do some cleanup ourselves.

At this point I also want to remind the reader that our main job for dealing with the uploaded file data may already have started before the last polling job is triggered. Due to the length of polling time interval such a situation is very probable. We had to take care of this situation on the Javascript client side to avoid a confusing mixture or overwriting of returning messages from both PHP jobs; see the last article of this series for more remarks on this point.

Encoding the JSON object

The final step is trivial:

function encodeAndSendJson() {
	$this->ajax_response['sys_msg'] .= ob_get_contents(); 
	ob_end_clean();
	$response = json_encode($this->ajax_response);
	echo $response; 
}

Some example output of the Ajax controlled client/server interaction for the file upload

After all this theory let us now have a brief look of output of the
Ajax interaction between the client and the server. Initially we give the user a chance to select a file

upl_form_1
 
The PHP job that created the web page which contains our upload form also started the required PHP session. You may ignore the upper parts of the displayed excerpt of our web page. The interesting things happen in the lower form which allows for file selection.

The attached information elements (progress bar, small text areas) below actually belong to a separate and different web page area (DIV container) which contains an invisible form for the control of the progress polling jobs.

Note that in the displayed test case we have chosen a ZIP file container (which actually contains 5 files with 5 million records each). Then we start the Ajax controlled transfer with a click on the button “Start Upload”. This job starts the file transfer and the first Ajax transaction. This will trigger the main PHP job “handle_uploaded_init_files.php5” for file processing when the data transfer has finalized.

upl_form_2

In parallel the sequence of polling jobs is automatically started in the background. Each Ajax transaction in the time loop triggers a job “check_progress.php5” on the server, which retrieves progress information from $_SESSION as discussed above. The information of each PHP polling job returned to the client via a JSON object is used to enlarge the progress bar and to display additional information text about the progress.

upl_form_3
upl_form_4

Note the display of the bytes transferred, the time elapsed and the remaining time estimated from the measured rate.

After all file data have been received at the server we display some final information in green on the left side. After that the main PHP job (triggered by our original form submit) takes over and the status bar runs again – this time indicating the processing of the individual files we moved to the server in our Zip file container.

upl_form_6

We turn to the handling of the ZIP container and its contents in one of the next articles of this series.

Adaption of the polling interval

In our previous example the polling period stayed at the lowest boundary of 1200 msec. The file (around 80 MByte) was not big enough for an adjustment at the given rate. However, a different example with a bigger file around 150 MByte shows the adaption of the polling period quite clearly. Excerpts from respective messages in the console.log:

time = 1433754422037, 
polling_
period = 1200

Present upload percentage = 3.0057897183539
rate is 1485, poll_soll = 1501,
polling_period = 1440

Present upload percentage = 5.0072492005327
rate is 1341, poll_soll = 1652  
polling_period = 1652

Present upload percentage = 7.0087017625639
rate is 1235, poll_soll = 1785
polling_period = 1652

Present upload percentage = 8.0094300619559
rate is 1209, poll_soll = 1821
polling_period = 1821

Present upload percentage = 11.011615536811
rate is 1250, poll_soll = 1765
polling_period = 1821


Present upload percentage = 12.012343836203
rate is 1219, poll_soll = 1807
polling_period = 1821

Present upload percentage = 13.013072712274
rate is 1193, poll_soll = 1844
polling_period = 1821

Present upload percentage = 14.013802741702
rate is 1172, poll_soll = 1875
polling_period = 1821

Present upload percentage = 16.015263377239
rate is 1231, poll_soll = 1790 
polling_period = 1821

Present upload percentage = 17.01599225331
rate is 1211, poll_soll = 1818
polling_period = 1821

Present upload percentage = 18.016722859418
rate is 1193, poll_soll = 1844
polling_period = 1821

Present upload percentage = 19.017454042205
rate is 1177, poll_soll = 1868
polling_period = 1821

Present upload percentage = 21.018911217668
rate is 1206, poll_soll = 1825
polling_period = 1821

 
The rate changes from 1200 msec to 1821 msec in 4 steps. That is exactly what we wanted to achieve.

However, there is a glitch of one additional percent every 4th step. This can be explained by the fact that our polling interval allows for around 1.2 % effective change but that we pick up only a 1% change due to the servers settings between 2 polling events. As the update period on the server is a bit smaller and thus a bit asynchronous to our polling events a glitch is bound to happen around every 4th event on the server. The reader may sketch the polling events and boundaries of the update period on the server in a drawing to get a clear understanding. We can ignore the glitch for all practical purposes.

Note that for larger files at the same rate the adaption would take more steps – but the behavior would more or less be the same.

A first conclusion

So far, so good. Obviously, it is possible to use the progress tracking of PHP versions ≥ 5.4 in combination with a series of Ajax controlled polling jobs. The Javascript side was a bit tricky – but we got it working – and we now have full control about the upload process and its messages. At least in principle …

However, there are still some severe issues. Especially, if the upload file is big and the rate is small. Then our mechanism may run into trouble due to parameter settings of the PHP engine. In the next article of this series

CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – VI

we will therefore discuss how we can get some control over situations in which violations of server limits become probable.