Lexicographically sorting large files in Linux

When I hear the word “sort” my first thought is usually “Hadoop”! Yes, sorting is one thing that Hadoop does well, but if you’re working with large files in Linux the built-in sort command is often all you need.

Let’s say you have a large file on a host with 2 GB or more of main memory free. The following sort command is an efficient way to sort large files lexicographically.

LC_COLLATE=C sort --buffer-size=1G --temporary-directory=./tmp --unique bigfile.txt

Let’s break this command down and examine each part in detail.
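
As a starting point, here is a minimal sketch that runs the same command on a tiny sample and annotates each option. The sample contents and the sorted.txt output name are assumptions made for illustration, and the comments describe GNU sort's documented behaviour rather than anything measured here.

```shell
# Hypothetical sample data standing in for a real bigfile.txt.
printf 'banana\nApple\ncherry\nbanana\napple\n' > bigfile.txt

# sort does not create the temporary directory for you, so make sure it exists.
mkdir -p ./tmp

# LC_COLLATE=C             compare lines by raw byte value instead of locale rules
#                          (note: if LC_ALL is set in your environment, it takes precedence)
# --buffer-size=1G         use up to roughly 1 GB of memory before spilling to disk
# --temporary-directory    write spill files to ./tmp instead of the default temp location
# --unique                 output only one line from each group of equal lines
LC_COLLATE=C sort --buffer-size=1G --temporary-directory=./tmp --unique bigfile.txt > sorted.txt

cat sorted.txt
```

With byte-wise comparison, uppercase ASCII letters sort before lowercase ones, so expect a different order than a locale-aware sort would give on mixed-case data.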

