Thursday, August 14, 2025

Splitting Large Files Using the `split` Command and a `for` Loop (Over a Sequence)

Looping through and splitting sequentially named large files.

 

for i in {1..4}; do split -d -l 20000 Phase-2-Slot-$i.csv p2s$i\_ --additional-suffix=.csv; done


End results

Input files:

Phase-2-Slot-1.csv
Phase-2-Slot-2.csv
Phase-2-Slot-3.csv
Phase-2-Slot-4.csv


Output files:

For Phase-2-Slot-1.csv -> output files are p2s1_00.csv, p2s1_01.csv, p2s1_02.csv and so on

For Phase-2-Slot-2.csv -> output files are p2s2_00.csv, p2s2_01.csv, p2s2_02.csv and so on
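
To check the naming pattern without touching the real data, the same loop can be tried on small throwaway files. This is only a sketch: it assumes bash and GNU coreutils `split` (for `-d` and `--additional-suffix`), the scratch directory and the 50-line sample files are made up for illustration, and the chunk size is reduced from 20000 to 20 lines.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the split on small sample files (GNU split assumed).
set -euo pipefail

workdir=$(mktemp -d)   # hypothetical scratch directory, not from the post
cd "$workdir"

# Create four 50-line stand-ins for the real CSV files.
for i in {1..4}; do
  seq 1 50 > "Phase-2-Slot-$i.csv"
done

# Same loop as in the post, but with 20 lines per chunk.
# "${i}_" is an equivalent way of writing $i\_ inside double quotes.
for i in {1..4}; do
  split -d -l 20 "Phase-2-Slot-$i.csv" "p2s${i}_" --additional-suffix=.csv
done

# List the chunks and show the line counts for slot 1.
ls p2s*
wc -l p2s1_*.csv
```

Each 50-line input should produce three chunks (20, 20 and 10 lines), named p2s1_00.csv, p2s1_01.csv, p2s1_02.csv for slot 1 and likewise for the other slots.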

Takeaway points:

  • Escaping the underscore:

    The variable $i is used in the output file names and is followed by an underscore ( _ ). If the underscore is not escaped, neither the variable's value nor the underscore appears in the output file name: the shell reads $i_ as a different variable named i_, which is a valid name but is unset, so it expands to nothing. The workaround is to escape the underscore after $i, as in the command above ( \_ ). Writing the variable as ${i}_ has the same effect.
  • The sequence loop using the for command:

    Note especially the curly braces (not parentheses) and the two dots between the integers.

If the first integer is smaller than the second, the counter steps up; if it is larger, the counter steps down.

for i in {a..b}  a < b := Step up counting
for i in {a..b}  a > b := Step down counting
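
Both takeaways can be checked quickly in a terminal. The snippet below is a minimal sketch assuming bash (brace ranges are not part of plain POSIX sh); the variable names `wrong`, `escaped`, `braced`, `up` and `down` are only illustrative.

```shell
#!/usr/bin/env bash
# Sketch of both takeaways: the underscore after $i and the brace range direction.

i=3
unset i_                 # make sure no variable named i_ exists

wrong="p2s$i_"           # read as ${i_}, which is unset, so the result is "p2s"
escaped=p2s$i\_          # the backslash ends the variable name: "p2s3_"
braced="p2s${i}_"        # braces delimit the name explicitly: "p2s3_"

echo "wrong=$wrong escaped=$escaped braced=$braced"
# prints: wrong=p2s escaped=p2s3_ braced=p2s3_

# Step up: first integer smaller than the second.
up=""
for n in {1..4}; do up="$up$n"; done

# Step down: first integer larger than the second.
down=""
for n in {4..1}; do down="$down$n"; done

echo "up=$up down=$down"
# prints: up=1234 down=4321
```

One caveat worth keeping in mind: bash expands the braces before it expands variables, so the range endpoints have to be literal integers; something like {1..$n} is not expanded into a sequence.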