EXTRACTION

Extracting the lines that contain given tags from a file

  • Here we look for the tags "4-PM-", "NUMMER" and "T4-TAMBOUR-NO" and copy the lines that contain them into a new file.
for f in *.csv; do
   # Only process files whose name contains "pi_pm4"
   if [ "$(echo "$f" | grep -c pi_pm4)" -gt 0 ]; then
      # Only create the new file if at least one line matches
      if [ "$(grep -E -a -h -c "4-PM-|NUMMER|T4-TAMBOUR-NO" "$f")" -gt 0 ]; then
         newf=break_pm4_"$f"
         grep -E -a -h "4-PM-|NUMMER|T4-TAMBOUR-NO" "$f" > "$newf"
         mv "$newf" "$PathRepo"
      fi
   fi
done
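
As a variant of the loop above, the file-name test can be left to the glob, and grep can be run only once per file. This is a minimal sketch, not the original procedure: the function name extract_tags is hypothetical, and it assumes the files are in the current directory and that $PathRepo is the destination directory used throughout this article.

```shell
# extract_tags: hypothetical helper, for illustration only.
# Copies the lines matching the tags into break_pm4_<file>, then moves that file to "$PathRepo".
extract_tags() {
   for f in *pi_pm4*.csv; do
      # If no file matches, the glob is left unexpanded: skip it
      [ -e "$f" ] || continue
      newf="break_pm4_$f"
      # grep returns a non-zero status when no line matches
      if grep -E -a -h "4-PM-|NUMMER|T4-TAMBOUR-NO" "$f" > "$newf"; then
         mv "$newf" "$PathRepo"
      else
         # No matching line: do not keep an empty result file
         rm -f "$newf"
      fi
   done
}
```

Calling extract_tags from the directory that holds the .csv files should give the same result files as the loop above.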

Retrieving fixed line numbers from a file

  • Here we copy only the first line of the file into a new file (the original file is kept and moved as well)
for f in *.csv; do
   if [ "$(echo "$f" | grep -c actions_JEU)" -gt 0 ]; then
      newfile=Game_$f
      head -n 1 "$f" > "$newfile"
      mv "$newfile" "$PathRepo"
      mv "$f" "$PathRepo"
   fi
done
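  • Here we retrieve lines 3 and 4 of a file (the last two of its first four lines)
for f in *.csv; do
   if [ "$(echo "$f" | grep -c RESUME)" -gt 0 ]; then
      file_part1=$(echo "$f" | sed 's/\.csv$/_part1.csv/')
      new_file2=$(echo "$f" | sed 's/\.csv$/.txt/')
      # Temporary file holding the first four lines
      head -n 4 "$f" > "$new_file2"
      # Keep the last two of those four lines, i.e. lines 3 and 4
      tail -n 2 "$new_file2" > "$file_part1"
      # The original file is kept, with the "ENTIER_" prefix
      mv "$f" "$PathRepo""ENTIER_$f"
      mv "$file_part1" "$PathRepo"
      rm "$new_file2"
   fi
done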
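
Since head -n 4 followed by tail -n 2 yields lines 3 and 4 of the file, the temporary file can be avoided by asking sed for that line range directly. The sketch below is an alternative, not the original procedure; the helper name print_lines is hypothetical.

```shell
# print_lines FIRST LAST FILE: print lines FIRST to LAST of FILE
# (hypothetical helper, for illustration only)
print_lines() {
   # -n suppresses automatic printing; "p" prints only the addressed range
   sed -n "${1},${2}p" "$3"
}
```

In the loop above, the head, tail and rm steps could then be replaced by a single print_lines 3 4 "$f" > "$file_part1".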

Retrieving the lines of a file before a particular word

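  • Here we retrieve the lines up to the first occurrence of the word "END" (the line containing "END" is included)
for f in *.csv; do
   if [ "$(echo "$f" | grep -c RESUME)" -gt 0 ]; then
      # Output file name: "Extract_" is a placeholder prefix, adapt it to your needs
      new_file="Extract_$f"
      # Line number of the first line containing "END"
      a=$(grep -n -m 1 END "$f" | cut -f1 -d ':')
      # a is empty when "END" is not found; the arithmetic turns it into 0
      NUM_LIGNE=$((a+0))
      head -n "$NUM_LIGNE" "$f" > "$new_file"
      mv "$new_file" "$PathRepo"
   fi
done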
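
The same extraction can be sketched with a single sed command, which prints each line and quits right after the first line containing "END". This is an alternative under stated assumptions, not the original procedure; the helper name lines_until_end is hypothetical.

```shell
# lines_until_end FILE: print FILE up to and including the first line containing "END"
# (hypothetical helper, for illustration only)
lines_until_end() {
   # sed prints every line by default; "q" quits after printing the first matching line
   sed '/END/q' "$1"
}
```

One difference to keep in mind: when the file contains no "END" at all, this variant prints the whole file, whereas head -n 0 produces an empty file.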

Extracting data from a Word file

/!\ NOT RECOMMENDED
Only works for .docx files

docx2txt monfichier.docx

Splitting a file that contains several headers into several files

Use this when a file contains several tables, for example when the columns in each table header are not the same.

# Files to split have a .ini extension
for f in *.ini; do
   if [ "$(echo "$f" | grep -c "Marques")" -gt 0 ]; then

      # Split the file each time a line starts with "[Marque_"
      # The split files have no extension
      newf=$(echo "$f" | cut -d "." -f1)_split
      csplit -kzf "$newf" "$f" '/^\[Marque_/' '{*}'
      rm "$f"

      # Add the .csv extension
      for fs in "$newf"*; do
         mv "$fs" "$PathRepo""$fs".csv
      done
   fi
done
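
To see what csplit produces, it can help to run it on a small sample first. The sketch below assumes GNU csplit (the -z option is a GNU extension); the sample content and the file name Marques_demo.ini are made up for the demonstration.

```shell
# Demonstration in a temporary directory; file name and content are made up.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > Marques_demo.ini <<'EOF'
[Marque_A]
col1;col2
1;2
[Marque_B]
col1;col2;col3
3;4;5
EOF

# -s: do not print the size of each piece
# -k: keep the pieces already written if an error occurs
# -z: do not create empty pieces (here, the empty piece before the first header)
# -f: prefix of the output files, which are numbered 00, 01, ...
csplit -s -k -z -f Marques_demo_split Marques_demo.ini '/^\[Marque_/' '{*}'

# List the pieces: one file per table, each starting with its "[Marque_...]" line
ls Marques_demo_split*
```

Each piece then holds a single table with its own header, which is what the loop above renames to .csv.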

Duplicating a file and renaming it

# $f is the current file name, e.g. inside a "for f in *.csv" loop
# Copy the files whose name STARTS with "todos"; the copy is named Time_<file>
if [ "$(echo "$f" | grep -c "^todos")" -gt 0 ]; then
   cp "$f" "Time_$f"
fi
# Copy the files whose name CONTAINS "todos"; the copy is named Time_<file>
if [ "$(echo "$f" | grep -c "todos")" -gt 0 ]; then
   cp "$f" "Time_$f"
fi
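
The name test can also be written with the shell's own pattern matching, without calling grep. This is a minimal sketch of that alternative; the helper name copy_with_prefix is hypothetical.

```shell
# copy_with_prefix FILE: copy FILE to Time_FILE when its name starts with "todos"
# (hypothetical helper, for illustration only)
copy_with_prefix() {
   case "$1" in
      # todos* matches names STARTING with "todos";
      # use *todos* instead to match names CONTAINING "todos"
      todos*) cp "$1" "Time_$1" ;;
   esac
}
```

Inside a loop, this would be called as copy_with_prefix "$f".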
