Sunday, March 7, 2010

merge pdf files

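# Concatenate page 1 of every PDF into a new file
cd ~/fay/2001/2001
# Move the first PDF out of the directory so the loop does not add it twice
mv A73557033.pdf ~/
pdftk A="$HOME/A73557033.pdf" cat A1 output ~/tmp.pdf
# Glob instead of parsing ls, so names with spaces survive
for i in *
do
    # Append page 1 of the next PDF to the pages collected so far
    pdftk A="$HOME/tmp.pdf" B="$i" cat A1-end B1 output ~/tmp1.pdf
    mv ~/tmp1.pdf ~/tmp.pdf
done ### This could be condensed into a "one-liner" if desired.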
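
The loop above rewrites the growing output file once per input, which is simple but does more copying than needed. A possible alternative, sketched here under assumptions, extracts page 1 of each PDF into a temporary directory and then joins the single pages with one final pdftk call. The function name merge_first_pages, its two arguments, and the temporary directory are hypothetical. Pages come out in alphabetical file-name order, so A73557033.pdf is not forced to the front as it is in the loop above.

```shell
# Sketch only: merge_first_pages and its arguments are hypothetical names.
# Usage: merge_first_pages SRC_DIR OUT_PDF
merge_first_pages() {
    src_dir=$1      # directory holding the source PDFs
    out_pdf=$2      # path of the merged result
    tmp_dir=$(mktemp -d) || return 1

    # Step 1: write page 1 of each PDF to the temp directory, same file name.
    for f in "$src_dir"/*.pdf; do
        [ -e "$f" ] || continue
        pdftk "$f" cat 1 output "$tmp_dir/$(basename "$f")" \
            || { rm -rf "$tmp_dir"; return 1; }
    done

    # Step 2: join the single-page files in glob (alphabetical) order.
    pdftk "$tmp_dir"/*.pdf cat output "$out_pdf"
    status=$?

    rm -rf "$tmp_dir"
    return $status
}
```

It could be called as: merge_first_pages ~/fay/2001/2001 ~/tmp.pdf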



Download database pdf files:
wget -F -np -r -l 2 -nd -nv -nc -nH -i a.html
The downloaded files do not have a .pdf suffix,
so find the pattern in their names and copy each one to a .pdf name.
In this case the names end with the pattern
"ste=5&docNum=A12345678$"

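# ./files is assumed to list the downloaded file names, one per line
for i in $(grep 'ste=5&docNum=.[0-9]\{7,\}$' ./files)
do
    # The document number is the fifth "="-separated field of the name
    a=$(echo "$i" | cut -d= -f5)
    echo "$a"
    cp "$i" "$a.pdf"
done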
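
The cut -d= -f5 step above depends on the name containing exactly four "=" signs before the document number. A variant that does not count fields could strip everything up to the last "docNum=" with shell parameter expansion. This is only a sketch: the function name copy_as_pdf and its directory argument are hypothetical, and it matches any name containing "docNum=" instead of applying the stricter grep pattern used above.

```shell
# Sketch only: copy_as_pdf and its argument are hypothetical names.
# Usage: copy_as_pdf DIR
copy_as_pdf() {
    dir=$1
    for f in "$dir"/*docNum=*; do
        [ -f "$f" ] || continue
        # Remove the longest prefix ending in "docNum=", leaving e.g. A12345678.
        doc=${f##*docNum=}
        echo "$doc"
        # Copy (not move), so the original download is kept, as in the loop above.
        cp -- "$f" "$dir/$doc.pdf"
    done
}
```

It could be called as: copy_as_pdf . (from the download directory).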
