Merging Two Files with AWK: A Guide
To merge two files using awk, you can use different strategies depending on whether you want to combine columns (e.g., join files based on a common key) or append rows (e.g., concatenate files vertically). Below are practical examples for both scenarios:
1. Merge Two Files Line-by-Line (Combine Columns)
If both files have the same number of lines and you want to merge them side by side (like paste), use awk to read both files sequentially.
file1.txt:
Apple
Banana
Cherry
file2.txt:
100
200
300
AWK Command:
awk 'NR==FNR {a[NR]=$0; next} {print a[FNR], $0}' file1.txt file2.txt
Output:
Apple 100
Banana 200
Cherry 300
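Explanation:
- NR==FNR: True only while awk is reading the first file (file1.txt), because the overall record number and the per-file record number are still equal.
- a[NR]=$0: Store each line of file1.txt in an array, indexed by line number.
- next: Skip the remaining rules for the current line, so the print rule never runs on lines of file1.txt.
- print a[FNR], $0: For each line of file2.txt, print the stored line of file1.txt with the same line number, followed by the current line.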
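The command above assumes equal line counts. If file1.txt is longer, its extra lines are silently dropped. The following is a minimal sketch of one way to keep them, assuming a fourth line (Date) that is not in the original sample. The temporary directory and the variable names n, m, and merged are illustrative choices.

```shell
# Work in a throwaway directory so no existing files are touched.
tmpdir=$(mktemp -d)
# "Date" is a hypothetical extra line with no partner in file2.txt.
printf 'Apple\nBanana\nCherry\nDate\n' > "$tmpdir/file1.txt"
printf '100\n200\n300\n' > "$tmpdir/file2.txt"

# n = number of lines cached from file1.txt, m = number of lines seen in file2.txt.
# The END rule prints any cached lines that were never paired.
merged=$(awk 'NR==FNR {a[NR]=$0; n=NR; next}
     {print a[FNR], $0; m=FNR}
     END {for (i=m+1; i<=n; i++) print a[i]}' \
    "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$merged"

rm -rf "$tmpdir"
```

If file2.txt is the longer file instead, its extra lines are still printed, each with a leading space where the missing file1.txt value would have been.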
2. Merge Two Files Based on a Common Key (Join Columns)
Example Files:
users.txt (Key: Column 1):
101 Alice
102 Bob
103 Charlie
scores.txt (Key: Column 1):
101 85
102 92
103 78
AWK Command:
awk 'NR==FNR {user[$1]=$2; next} $1 in user {print $0, user[$1]}' users.txt scores.txt
Output:
101 85 Alice
102 92 Bob
103 78 Charlie
Explanation:
- Store names from users.txt in an array, using column 1 as the key.
- Then read scores.txt and print only the lines whose key is present in that array, appending the stored name.
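Because of the $1 in user condition, rows in scores.txt without a matching user are dropped. As a sketch of a left-join variant, the example below keeps such rows and fills the gap with a placeholder. The 104 row, the "NA" placeholder, and the joined variable are assumptions made for illustration.

```shell
tmpdir=$(mktemp -d)
printf '101 Alice\n102 Bob\n103 Charlie\n' > "$tmpdir/users.txt"
# 104 is a hypothetical score row with no entry in users.txt.
printf '101 85\n102 92\n104 60\n' > "$tmpdir/scores.txt"

# Print every line of scores.txt; use the stored name if the key exists, else "NA".
joined=$(awk 'NR==FNR {user[$1]=$2; next}
     {print $0, (($1 in user) ? user[$1] : "NA")}' \
    "$tmpdir/users.txt" "$tmpdir/scores.txt")
printf '%s\n' "$joined"

rm -rf "$tmpdir"
```

Using the in operator rather than testing user[$1] directly avoids creating empty array entries for keys that do not exist.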
3. Merge Files with Different Delimiters
If your files use different delimiters (e.g., one uses commas and the other tabs), use -F to set the delimiter for the first file and split() to break up lines of the second file by its own delimiter.
file1.csv:
A,Apple
B,Banana
C,Cherry
file2.tsv (tab-separated):
A	Red
B	Yellow
D	Green
Command:
awk -F',' 'NR==FNR {data[$1]=$2; next} {split($0, a, "\t")} a[1] in data {print a[1] "\t" a[2] "\t" data[a[1]]}' file1.csv file2.tsv
Output:
A	Red	Apple
B	Yellow	Banana
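An alternative to split() is to change FS between the two files with command-line variable assignments, which awk applies just before it opens the next file operand. The sketch below assumes the same sample data. The temporary directory and the result variable are illustrative.

```shell
tmpdir=$(mktemp -d)
printf 'A,Apple\nB,Banana\nC,Cherry\n' > "$tmpdir/file1.csv"
printf 'A\tRed\nB\tYellow\nD\tGreen\n' > "$tmpdir/file2.tsv"

# FS=',' applies while reading file1.csv; FS='\t' takes over for file2.tsv,
# so $1 and $2 can be used directly in both passes.
result=$(awk 'NR==FNR {data[$1]=$2; next}
     $1 in data {print $1 "\t" $2 "\t" data[$1]}' \
    FS=',' "$tmpdir/file1.csv" FS='\t' "$tmpdir/file2.tsv")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

This keeps the rules free of delimiter handling, at the cost of making the argument order on the command line significant.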
4. Merge and Append Rows (Concatenate Files)
If you want to stack two files vertically (like cat):
awk '{print}' file1.txt file2.txt
Output:
Apple
Banana
Cherry
100
200
300
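Plain concatenation is usually simpler with cat. awk becomes useful when each row should be changed on the way through. As one possible illustration, the sketch below tags every row with the file it came from, using the built-in FILENAME variable. The temporary directory and the tagged variable are illustrative names.

```shell
tmpdir=$(mktemp -d)
printf 'Apple\nBanana\nCherry\n' > "$tmpdir/file1.txt"
printf '100\n200\n300\n' > "$tmpdir/file2.txt"

# Run inside the directory so FILENAME holds the bare file name, not a full path.
tagged=$(cd "$tmpdir" && awk '{print FILENAME ": " $0}' file1.txt file2.txt)
printf '%s\n' "$tagged"

rm -rf "$tmpdir"
```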
Key Notes
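- Memory Usage: The NR==FNR pattern holds the entire first file in an array, so memory use grows with the size of that file. For key-based merges, pass the smaller file first when you can.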
- Sort First: The awk array lookups above work on unsorted input, but join expects both files to be sorted on the key, so run sort before using it.
Sort Example:
sort file1.txt > sorted_file1.txt
sort file2.txt > sorted_file2.txt
Use join for Simplicity:
join -1 1 -2 1 users.txt scores.txt
Output:
101 Alice 85
102 Bob 92
103 Charlie 78
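The sample files happen to be sorted already. As a sketch of the full sort-then-join workflow, the example below starts from deliberately shuffled copies of the same data. The row order, the .sorted file names, and the joined variable are assumptions for illustration.

```shell
tmpdir=$(mktemp -d)
# Rows are deliberately out of order to show why the sort step matters.
printf '103 Charlie\n101 Alice\n102 Bob\n' > "$tmpdir/users.txt"
printf '102 92\n103 78\n101 85\n' > "$tmpdir/scores.txt"

# Sort both files on the first field, which is the join key.
sort -k1,1 "$tmpdir/users.txt"  > "$tmpdir/users.sorted"
sort -k1,1 "$tmpdir/scores.txt" > "$tmpdir/scores.sorted"

joined=$(join -1 1 -2 1 "$tmpdir/users.sorted" "$tmpdir/scores.sorted")
printf '%s\n' "$joined"

rm -rf "$tmpdir"
```

Run sort and join under the same locale so that both tools agree on the ordering of the keys.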
When to Use awk vs. Other Tools
- awk: For complex merging, filtering, or transformation logic.
- paste: For simple side-by-side merging of lines.
- join: For fast, key-based merging of sorted files.