Johan Adlers sporadiska skriverier
  • Convert file extensions to lower case (rename, Linux shell script)

    Posted on November 20th, 2012 Johan Adler 5 comments

    I am using Unison more and more, and recently I decided to try using it to keep my digital image archive synchronized. It seems that at some point I did not explicitly instruct the image importing program of the day (Shotwell being the program I mostly use) to rename the image files so that they consistently use a lower case file extension.

    Thus I felt the need to rename all image files (in the shotwell folder) to a lower case file extension. There may or may not be a program out there to do this, but I wrote my own shell script for this purpose. My experience is that I learn more by doing it myself, and a lot of the tricks I learn can be reused the next time.

    This is how I did it. I cannot promise that it will work for anyone else (indentation lost in WordPress):

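    #!/bin/sh
    # Start from the current directory.
    imgdir=$(pwd)
    # Collect every distinct extension that contains an upper case letter.
    imgucext=$(find "${imgdir}" -type f -name '*.*' | awk -F'.' 'NF > 0 { print $NF }' | sort | uniq | LC_ALL=C grep '[[:upper:]]')
    for ucext in ${imgucext}
    do
    lcext=$(echo "${ucext}" | awk '{print tolower($0)}')
    # Escape spaces and parentheses, then turn each path into an mv command.
    # Other shell metacharacters in file names are not handled.
    find "${imgdir}" -type f -name "*.${ucext}" -print | \
    sed '{s/ /\\ /g}' - | \
    sed '{s/(/\\(/g}' - | \
    sed '{s/)/\\)/g}' - | \
    sed "{s/\(.*\.\)${ucext}$/mv -n -v \1${ucext} \1${lcext}/g}" - | \
    sh
    done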

    Starting in the current folder/directory, I check which upper case extensions are present, find all files using them, escape spaces and parentheses for the shell, and create a rename command for each file, which is then piped to the shell. For debugging purposes, replace the sh near the end with cat to print the commands instead of running them.
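
    The same renaming can also be sketched without generating shell commands through sed, which sidesteps escaping spaces, parentheses and other special characters. This is an alternative sketch, not the script above: the function name lowercase_extensions is made up for illustration, and it assumes a find that supports -exec ... {} + and an mv that supports -n.

```shell
#!/bin/sh
# lowercase_extensions DIR (hypothetical helper): rename every file under DIR
# whose extension contains an upper case letter so the extension is lower case.
lowercase_extensions() {
    # The -name pattern pre-selects names with an upper case letter after a dot;
    # the inner script then checks the real (last) extension.
    find "$1" -type f -name '*.*[[:upper:]]*' -exec sh -c '
        for path in "$@"; do
            dir=${path%/*}
            file=${path##*/}
            base=${file%.*}
            ext=${file##*.}
            # Skip dotfiles such as .Xauthority, which have no base name.
            [ -n "$base" ] || continue
            lcext=$(printf "%s" "$ext" | tr "[:upper:]" "[:lower:]")
            # Skip names like a.B.jpg whose last extension is already lower case.
            [ "$ext" = "$lcext" ] && continue
            # mv -n never overwrites an existing lower case file.
            mv -n -- "$path" "$dir/$base.$lcext"
        done
    ' sh {} +
}
```

    Calling lowercase_extensions . in the image folder would rename in place. Because the file names are passed as arguments rather than pasted into command text, no escaping step is needed.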

     

    5 responses to “Convert file extensions to lower case (rename, Linux shell script)”

    • It turned out that Unison is not what I need in this case.

      What I want to accomplish is to transfer photos from cameras, downloaded to my desktop box or one of the laptop boxes, to a unified structure on my NAS (managed by an Alix 2D13 from PC Engines, http://pcengines.ch/alix2d13.htm).

      What I need is mostly a one-way transfer of any files not already present in the NAS image storage structure. Unison is great for two-way synchronization, but in this case it seems that direct invocation of rsync is superior, and I am creating another shell script using rsync for doing what I want.
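
      A one-way transfer of only the missing files could look roughly like the sketch below. It is not the script mentioned in the comment; the function name push_new_photos and the NAS path in the example are hypothetical. It relies on rsync's --ignore-existing option, which skips any file that already exists on the receiving side.

```shell
#!/bin/sh
# push_new_photos SRC DEST (hypothetical helper): copy files from SRC that do
# not yet exist under DEST, leaving files already present in DEST untouched.
push_new_photos() {
    # The trailing slash on the source copies its contents, not the directory itself.
    # -a preserves times and permissions, --stats prints a transfer summary,
    # -h prints sizes in human-readable units.
    rsync -a --ignore-existing --stats -h "${1%/}/" "${2%/}/"
}

# Example with a made-up NAS location:
# push_new_photos "$HOME/Pictures/shotwell" "nas:/srv/photos"
```

      Since existing files are never updated, a copy that was edited on the desktop after the first transfer would not be sent again; that matches the "files not already present" goal but is worth keeping in mind.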

    • Crap! Files from some sources or eras have an uppercase file name and a lowercase extension, but not consistently; different boxes and software seem to have handled this in different ways.

      Some of the images exist in more than one copy, each with a different combination of uppercase and lowercase letters in its name. Some copies have been rotated, and tags may have been added, and Shotwell or whatever has been instructed (by me, mea culpa) to write these changes/additions to the file metadata (EXIF or similar). This makes a simple search for duplicates based on file size and/or checksum less than perfect.

      I have decided that from now on any digital photos (and other files from digital cameras, such as movies) shall use a fully lowercase filename and extension. But I cannot just have a script rename all files with any uppercase part of filename/extension, since in quite a few cases a file with just that (fully lowercase) name already exists.

      So one current project is a script that will identify files with any uppercase character in either filename or extension (but not the path, since ${HOME}/Pictures/shotwell/ will always contain an uppercase character). Then, for each file whose name contains uppercase (“ucf”), it will look for files in the same directory whose names differ only in case, and create a backup directory structure that the extra versions will then be moved to.
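
      The plan described here can be sketched as follows. This is only an illustration under stated assumptions, not the script in progress: the function name lowercase_names is invented, the backup root is assumed to lie outside the image tree, the file that is "in the way" (the existing lowercase one) is the copy moved to the backup, and file names are assumed not to contain newlines.

```shell
#!/bin/sh
# lowercase_names ROOT BACKUP (hypothetical helper): give every file under ROOT
# a fully lower case name. If the lower case name is already taken, the file
# occupying it is first moved to the same relative directory under BACKUP.
lowercase_names() {
    root=${1%/}
    backup=${2%/}
    # -name matches only the file name, so upper case letters in the path
    # (such as "Pictures") do not trigger a rename.
    find "$root" -type f -name '*[[:upper:]]*' | while IFS= read -r path; do
        dir=${path%/*}
        name=${path##*/}
        lcname=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
        target="$dir/$lcname"
        if [ -e "$target" ]; then
            # Mirror the directory layout under the backup root.
            rel=${dir#"$root"}
            mkdir -p "$backup$rel"
            mv -n -- "$target" "$backup$rel/$lcname"
        fi
        # mv -n leaves the file alone if the target could not be cleared.
        mv -n -- "$path" "$target"
    done
}
```

      Using mv -n throughout means that a second collision on the same name leaves the remaining file untouched rather than overwriting anything, so the function can be rerun after inspecting the backup.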

      I know, I should check file size, modification date and probably MD5 sum for each of those files, delete the identical ones, keep the newest or largest in the main directory structure and move the rest to the backup directory structure. I am not quite there yet…
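
      The MD5 part of that check could be sketched like this, assuming the md5sum tool from GNU coreutils is available. The function name list_duplicates is made up. It only finds byte-identical files, so copies whose metadata was rewritten will not be reported as duplicates.

```shell
#!/bin/sh
# list_duplicates DIR... (hypothetical helper): print "checksum  path" for every
# file whose MD5 sum is shared with at least one other file under the given
# directories. Files with a unique checksum are not printed.
list_duplicates() {
    find "$@" -type f -exec md5sum {} + | sort |
        awk '{ hash = $1 }
             # Same checksum as the previous line: print the first member of
             # the group once, then every further member.
             hash == prev { if (!shown) print first; print $0; shown = 1; next }
             # New checksum: remember this line as a possible group start.
             { prev = hash; first = $0; shown = 0 }'
}
```

      Deciding which member of each group to keep (newest, largest) is deliberately left out; the output is meant to be reviewed before anything is deleted.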

      For the last (close to) eight hours I have been using rsync, without the update option, to transfer all (relevant, in this directory structure) files from the NAS to my desktop. With a 38 GB image repository this is bound to take some time. (I am currently at March last year.)

      The next step could either be running fslint to find obvious duplicates and delete them, or running my script mentioned above that will convert any and all files to lowercase while making a backup of any files that would be in the way. I know, fslint can scan more than one place, so I could move the extra files first, if I want.

      Also, the script above is not perfect; I rewrote it and hope this meant some improvement.

    • Rsync stats, no joke:

      Number of files: 28064
      Number of files transferred: 19497
      Total file size: 37.66G bytes
      Total transferred file size: 28.39G bytes
      Literal data: 22.68G bytes
      Matched data: 5.71G bytes
      File list size: 562.58K
      File list generation time: 0.004 seconds
      File list transfer time: 0.000 seconds
      Total bytes sent: 44.57M
      Total bytes received: 22.35G

      sent 44.57M bytes received 22.35G bytes 753.52K bytes/sec
      total size is 37.66G speedup is 1.68
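
      The last two lines of the summary can be cross-checked with a little arithmetic. rsync reports speedup as the total file size divided by the bytes sent plus received, and the transfer time follows from the bytes moved and the reported rate. The function names below are made up, and the byte counts assume that G, M and K are decimal units (powers of 1000), which is an assumption about how this rsync was invoked.

```shell
#!/bin/sh
# rsync_speedup TOTAL SENT RECEIVED (hypothetical helper): all three values in
# the same unit; prints TOTAL / (SENT + RECEIVED) with two decimals.
rsync_speedup() {
    LC_ALL=C awk -v total="$1" -v sent="$2" -v recv="$3" \
        'BEGIN { printf "%.2f\n", total / (sent + recv) }'
}

# rsync_hours BYTES RATE (hypothetical helper): bytes moved and bytes per
# second; prints the approximate duration in hours with one decimal.
rsync_hours() {
    LC_ALL=C awk -v bytes="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", bytes / rate / 3600 }'
}

# Values from the summary, in G bytes: 37.66 total, 0.04457 sent, 22.35 received.
rsync_speedup 37.66 0.04457 22.35     # prints 1.68
# 22.35G + 44.57M bytes at 753.52K bytes/sec, assuming decimal units.
rsync_hours 22394570000 753520        # prints 8.3
```

      Both results agree with the summary: the speedup matches the reported 1.68, and roughly eight hours of transfer fits the duration mentioned in the previous comment.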


