grep2awk: A small zsh/zle helper
While trying to find the needle in a haystack, you find yourself recklessly grepping some log files. Suddenly, it occurs to you that there might be a pattern in the data, and awk will be the fastest way to figure out whether this pattern has any relevance. You want to change your grep into an awk one-liner.
This involves some mechanical work: Arrow up to get to the command line, move to the word grep and change it, forward to the start of the regular expression and add '/. Move to the end of the regular expression, and type: / {}'. Not a big deal, but mechanical work, which does add up if you're doing this eight times a day.
For this slight inconvenience, the tool grep2awk was written. It finds the first occurrence of the word grep in the current command line, and tries to convert the options and the regular expression into a skeleton for an awk-script. Just press a key you have chosen yourself, and you're already past the point of potential distraction which the mechanical work can entail.
How to use:
Clone the repository someplace:
git clone 'https://github.com/joepvd/grep2awk.git'
Then put the file grep2awk somewhere in your $fpath. Make sure the file gets autoloaded, register it as a line editor (zle) widget, and assign a key binding to it:
autoload -Uz grep2awk
zle -N grep2awk
bindkey "^X^A" grep2awk
Now, pressing <CTRL-X>-<CTRL-A> will bring you goodies!
The following grep options are supported:
- -v: inverse match
- -w: word match
- -x: line match
- -l: list matching files
- -L: list not matching files
- -H: include filename in result
- -n: include line number in result
- -c: count occurrences per file
- -i: case insensitive matching
- -E: Extended Regular Expressions
- -F: Fixed string matching
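To give an idea of the kind of skeleton these options map to, here are hand-written awk equivalents for a few of them. This is a sketch of the correspondence, not the literal output of grep2awk, which may well differ; the sample file and its contents are made up for the demonstration.

```shell
# Hypothetical sample data, only for this demonstration.
log=$(mktemp)
printf '%s\n' 'ok start' 'ERROR disk' 'error net' 'ok end' > "$log"

# grep 'error'     : print matching lines
awk '/error/' "$log"

# grep -v 'error'  : print lines that do not match
awk '!/error/' "$log"

# grep -c 'error'  : count matching lines (n+0 prints 0 when nothing matched)
awk '/error/ {n++} END {print n+0}' "$log"

# grep -n 'error'  : prefix each match with its line number
awk '/error/ {print NR ":" $0}' "$log"

# grep -i 'error'  : case-insensitive match, done portably with tolower()
awk 'tolower($0) ~ /error/' "$log"
```

Once the pattern sits between slashes, the empty action can grow into whatever the analysis needs, which is the whole point of the conversion.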
Development
If you source the file init.zsh, the development version of grep2awk will be made available under key binding <CTRL-P>. Handy for quick testing.
There is a testing library in the t-directory, in which the testing framework from the ZSH-project has been adjusted to work with the currently installed shell. Please run and update the tests when playing with the code.
Bugs
There are some bugs. The conversion from Basic Regular Expressions (which bare grep uses) to Extended Regular Expressions (which egrep and awk use) has not been implemented. The treatment of backslashes in the conversion from Fixed String to Extended Regular Expression is not working. Furthermore, the context options (-A, -B, -C) are not implemented, and neither is -o (only-matching). Some funky stuff with snooping aliases and the (deprecated) environment variable GREP_OPTIONS is as of yet not implemented. Also, colorized output is not supported.
Please let me know whether you like it, and what could be better to support your needs!
zsh: learn while doing
What good are advanced and hard to remember features if there is no help available at a fingertip? Globbing in zsh might be the best thing since sliced bread, but if you don't know the details, you might as well be chewing on a brick.
So, why not organize in such a way so that the details are readily available? One of the many nice ideas from feh's excellent zsh configuration file does exactly that, but it looked ugly on a too small terminal, and it looks like quite some manual work to format the information in ~/.zshrc. So I made zsh-hints, a small helper program that turns a definition file into helpful hints.
Who can remember all those glob flags? Imagine wanting to see the setgid, world-readable files somewhere in a subdirectory. Just hit <CTRL X><g> …
% print -l **/*<CTRL X><g>
… and you will be presented with this result:
% print -l **/*
/ ▶ directories
F ▶ non-empty directories (full)
. ▶ plain files
- ▶ executable plain files
l[-+]n ▶ link count
r,w,x ▶ owner (r)eadable (w)ritable,
e(x)ecutable files
A,I,E ▶ group re(A)dable wr(I)table,
(E)xecutable files
R,W,X ▶ world (R)readable, (W)ritable,
e(X)ecutable files
s,S,t ▶ setuid, setgid, sticky bit files
f[+=-]NNN ▶ files with access rights matching
+,-,= octal number
{U,G}NNN ▶ owned by effective (U)ser,(G)roup
ID
{u,g}NNN ▶ owned by user, group id `NNN`
{u,g}:name: ▶ owned by user, group name `name`
...18 hints omitted.
A quick glance will help you conclude that (.SR) is what you are looking for: plain files (.) that are setgid (S) and world readable (R).
This example was made on a rather small terminal. That is why some of the explanations were wrapped, with the continuation lines displayed after a secondary separator of just a space. The list also did not fit in the vertical direction, so an optional message tells you how many hints you are missing.
The key file is really simple to make: It just assumes that the first space separates the key from the explanation. There is no way of putting a space in the key.
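The first-space rule is easy to picture with two standard parameter expansions. This is only an illustration of the file format, not the code zsh-hints itself uses; the variable names are invented, and the sample line is made up in the style of the hints shown above.

```shell
# Split one key-file line at its first space.
# "key" and "text" are illustrative names, not taken from zsh-hints.
line='l[-+]n link count'
key=${line%% *}     # everything before the first space
text=${line#* }     # everything after the first space
printf '%s ▶ %s\n' "$key" "$text"
```

Because the split happens at the first space, the explanation may contain as many spaces as it likes, while the key cannot contain any.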
All the details of installation and configuration are covered in the README. Please let me know if it does or does not work for you.
Enjoy :)
Revisiting array-to-quote
Computing professionals are in some ways similar to athletes and musicians: continued practice makes them better at what they do. Interestingly, there does not seem to be much of a culture of inconsequential coding sessions to sharpen your skills. Dave Thomas practices what he preaches with his code katas.
The nice part about repetitive tasks is that there is a reason to automate 'em. And when the automated solution starts to feel like a drag, it's time to improve the solution. Repetitive tasks lend themselves very well to katas. You probably know the corner cases, and have some ideas of what would suit your work flow. Continued study of the mechanics of simple manipulations really can get you forward. As an addition to the metaphor of katas, I'd like to propose thinking about the woodworker sharpening his tools.
That was more than enough meta for an introduction to pretty down-to-earth stuff. As I regularly have to shape some output into a quoted, comma separated list, I came to revisit the solution posted in my first post. This time, I set out to make a keyboard shortcut that operates on the word closest to the cursor in a terminal running zsh.
Just drop this part in your ~/.zshrc, and you should have that short moment of gratification when it seems that the computer is obeying your will at the gentlest of gestures:
array-to-quote() {
  autoload -U modify-current-argument
  modify-current-argument '$(
    if [[ ${(Pt)ARG} = "array" ]]; then
      print ${(j., .)${(qq)${(P)ARG}}}
    elif [[ -r $ARG ]]; then
      print ${(j., .)${(qq)${(f)"$(<$ARG)"}}}
    else
      print ${(qq)ARG}
    fi
  )'
}
zle -N array-to-quote
# Terminal: Stop stealing CTRL-q and CTRL-s!
stty start '^-' stop '^-'
bindkey "^q" array-to-quote
If you copy-paste this into your zsh, and press CTRL-q, the word under your cursor will become quoted. If that word happens to be a file, the lines of that file will appear as a quoted list. If the word is the name of an array, the elements are added nicely quoted and comma separated. Nice huh?
So, what is actually happening in these few lines?
First, a function is defined that uses a function usually distributed with zsh: modify-current-argument. This function takes an expression (or the name of a function) as an argument, and as a bonus, that expression can use the variable $ARG: the word under or left of the cursor. Have a look at man zshcontrib for a complete description.
The expression that modify-current-argument evaluates makes use of parameter expansion flags, and you should be able to follow along after reading my earlier post on the topic.
Now we have a function that takes the word under the cursor as an argument. This we need to have as a keyboard shortcut. zle -N function-name does the first part: it makes the function function-name available as a command line editing widget. This widget is in turn bound to CTRL-q with the bindkey statement. There happened to be something special about CTRL-q: with the stty command, I needed to tell the terminal driver to stop listening for a functionality that I have never knowingly used. Please tell me if I miss out on cool stuff...
But wait a sec. There is this other thing that I regularly do: To make a regular expression from some kind of list. This is what does that trick, and binds it under CTRL \:
array-to-pipe() {
  autoload -U modify-current-argument
  modify-current-argument '$(
    if [[ ${(Pt)ARG} = "array" ]]; then
      print ${(j.|.)${(P)ARG}}
    elif [[ -r $ARG ]]; then
      print ${(j.|.)${(f)"$(<$ARG)"}}
    else
      print ${ARG}
    fi
  )'
}
zle -N array-to-pipe
bindkey '^\' array-to-pipe
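Outside the line editor, the file case of array-to-pipe can be approximated with standard tools. A minimal sketch, assuming a file with one term per line (the file and its contents are invented for the example); unlike the widget, it does nothing for arrays or plain words.

```shell
# Hypothetical input: one search term per line.
terms=$(mktemp)
printf '%s\n' 'event1' 'action2' 'error' > "$terms"

# paste -s joins all lines of a file into one; -d sets the joining character.
pattern=$(paste -s -d '|' "$terms")
printf '%s\n' "$pattern"

# The joined string can go straight into an extended regular expression.
printf '%s\n' 'an error occurred' 'all fine' | grep -E "$pattern"
```

The widget is still the nicer tool while typing, since it rewrites the word in place instead of needing a command substitution.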
Keep sharpening your knives :)
readline and psql
One of the nicer features of PostgreSQL is its client: psql. This nifty console application has built-in readline support. As I am spending so much time in psql sessions, it is worth learning how to abuse the readline library for some keyboard magic.
Readline is what you are actually working with (or against!) in an interactive Bash session. ZSH has its own infrastructure for interaction, but there are quite some applications besides Bash which make use of this interaction infrastructure. A basic list of readline keyboard shortcuts has been compiled at this page, and most of these will be in the muscle memory of console warriors. Suffice to say that this is just a part of the default configuration, see man 3 readline for some more possibilities.
If you find that you are often typing the same thing in a readline-supported application, you might be interested to hear that readline kindly offers to do the typing for you. These are called keyboard macros, and are configured in readline's configuration file: ~/.inputrc.
The following is very useful, almost taken directly from man 3 readline. If you put this in ~/.inputrc, pressing Alt-q will put single quotes around the word currently under the cursor:
"^[q": "^[b\'^[f\'"
This looks cryptic, so let's have a closer look. Before : is the key combination that triggers the action after the colon. ^[q translates into Alt-q, so that will trigger the action.
But how do you know that ^[q happens to represent Alt-q? The good news is that you don't need to know, as long as you use the following method in vim to generate the escape code. In insert mode, press CTRL-V. This tells vim to insert literally what comes next. If you press Alt-q directly after CTRL-V, the sequence will show up, and that is all there is to it. You will notice that when moving the cursor over the ^[, the cursor jumps over the two signs as if they were one. That is because it _is_ one character: the escape character. Copy-pasting these escape sequences into your editor most likely will not work.
So what about the action after the colon? This is a sequence that you can type literally in psql. Press Alt-b (bound to backward-word), then type a single quote (which in ~/.inputrc needs to be escaped by a backslash). Continue by pressing Alt+f, which brings you to the end of the word, where another single quote is inserted. Result: The current word is quoted. With this configuration in place, you just need to press one key combination, instead of needing complicated eye-hand coordination to move cursor to where it should be.
As I am working with a horribly normalized data model, where finding what one needs can take an incredible amount of time, I often query the system tables of postgres to navigate the database. A question that I keep asking is: in which tables does this column name appear? A valuable indicator of the role of a table is the amount of information it contains. The following query helps me a lot to quickly find what I am actually looking for:
select distinct
relname as table,
reltuples as rowcount
from information_schema.columns as cols
join pg_class on pg_class.relname = cols.table_name
where column_name = 'name-of-column'
and reltuples > 0
order by reltuples;
It is a bit of a nuisance to need to edit this text often to enter the column that I am looking for. With the following line, I can type the column on the command line, and after pressing Alt-c (mnemonic: columns), the following happens:
=> colname<Alt-c>
=> select distinct
-> relname as table,
-> reltuples as rowcount
-> from information_schema.columns as cols
-> join pg_class on pg_class.relname = cols.table_name
-> where column_name = 'colname'
-> and reltuples > 0
-> order by reltuples;
┌─────────────┬──────────┐
│ table │ rowcount │
├─────────────┼──────────┤
│ table1 │ 7 │
│ table2 │ 10 │
│ table3 │ 22 │
│ table4 │ 80 │
│ table5 │ 126 │
│ table6 │ 13460 │
│ table7 │ 50112 │
└─────────────┴──────────┘
(7 rows)
The following, admittedly very ugly, line in ~/.inputrc makes this possible:
"^[c": "\C-a\C-kselect distinct\n relname as table,\n reltuples as rowcount \nfrom information_schema.columns as cols \njoin pg_class on pg_class.relname = cols.table_name \nwhere column_name = '\C-y'\n and reltuples > 0 \norder by reltuples;\n"
- CTRL+a: The cursor is moved to the start of the line.
- CTRL+k: Everything from the cursor until the end of the line is killed, that is, deleted and stored in the kill buffer.
- Then, a lot of typing happens, including newlines and some spacing, until just after the first quote.
- CTRL+y: The contents of the kill buffer are yanked in place.
- ...and some concluding typing happens.
This keyboard shortcut does not make a lot of sense in any other program than psql. Luckily, you can select the program for which this shortcut is available by encapsulating the configuration as follows:
$if psql
# some configuration
$endif
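Putting the pieces together, both macros from this post could live in one conditional block. The sketch below writes such a block to a scratch file rather than touching your real ~/.inputrc. It spells the escape character in readline's \e notation instead of using a literal escape, so this version survives copy and paste; the scratch file is an assumption of the example, everything else is taken from the lines above.

```shell
# Write the fragment to a scratch file; copy it into ~/.inputrc by hand if you like it.
snippet=$(mktemp)
cat > "$snippet" <<'EOF'
$if psql
# Alt-q: put single quotes around the word at the cursor
"\eq": "\eb\'\ef\'"
# Alt-c: wrap the typed column name in the table lookup query
"\ec": "\C-a\C-kselect distinct\n relname as table,\n reltuples as rowcount \nfrom information_schema.columns as cols \njoin pg_class on pg_class.relname = cols.table_name \nwhere column_name = '\C-y'\n and reltuples > 0 \norder by reltuples;\n"
$endif
EOF
cat "$snippet"
```

The quoted 'EOF' delimiter matters: it stops the shell from touching the backslashes and the $if before they reach the file.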
This way, you can reprogram all your keys, including function keys!, to do something useful.
less: a love story
Less. I use it more and more. I would not be surprised if at least a fourth of my time in a console window is actually spent within less. It is just a pager, you say? True, but one that has been in active development since 1983. The original developer is still the maintainer. You don't need to be an experimental archeologist to put this tool to use. Getting acquainted with this pager will pay off pretty fast.
If you haven't done so yet, have a look at man 1 less. The synopsis looks like a Sesame Street episode with brain damage:
less [-[+]aABcCdeEfFgGiIJKLmMnNqQrRsSuUVwWX~]
It might appear that this overabundance of seemingly unneeded features could serve as a practical definition of bloatware. And these are just the options without arguments. Most of the same options can be used from within the application itself. Once you notice that you like some options, you can make those the default. There are some true gems hidden in the alphabet soup of the synopsis. I will show which ones I regularly use, but only after having a close look at the searching capabilities of less.
Effective use of less implies that one can use its searching possibilities. Striking the / initiates a search, and enter concludes it. Use n to go to the next match, N for the previous. Searching backward is started off with ?.
Search handles by default regular expressions. With some terminal escape codes set up, less will highlight the matches. I often type a string like this as a make-shift highlighter:
/someID=[0-9]+|error|(STATE|button).*$
Less stores the searches in ~/.lesshst, or in the file indicated by the environment variable $LESSHISTFILE. After typing /, the up and down arrows are available to recall previous searches.
A search can also be started from the shell. When I already know what to look for, I often use this, also to leave traces of what I am interested in in the shell history. Chances are that I already used this search term in the shell, or will in the near future. Next to helping myself to use zsh's history completion feature, this has proven to be a real time saver when I need to revisit the same problem again. A typical use is when I want to know what happened at a specific time:
% less +/^16:34 logfile
This will open the file, and jump immediately to the first line with that time stamp. Less will also store this search in its history file. Unfortunately, there is no mechanism to search the past searches.
The most recent search term has a special role when pressing ESC-F. New content of the file will be displayed as with tail -f or the normal F-function, but the scrolling will stop with the first match of the most recent search.
A lesser known search facility is the one initiated by &. This filters the output to just the matching lines. I use this often when looking at the output of some command: One can refrain from running the command again only to use grep to find the interesting lines. A renewed search is done on the whole output, so one cannot apply a filter on a filter. Specifying lines that do not match the expression (grep -v), can be initiated by typing &!. After having refined the search pattern, the result can be saved with s filename.
The &-filter interacts with the search history. As I regularly have to dig into log files where the first encounter is really helped by such a filter, I set up an alias like this:
alias lf="less +$'&event1|action2|state3|error\n' "
With this less-filter I can quickly get a bird's-eye perspective on the events. The filter can be disabled with an empty filter, so by pressing &[Enter]. From there I can easily zoom in, and I have this search string ready for poor man's highlighting. A quick -N to display line numbers, and typing the number of the line where I want to start to look, followed by g, brings me immediately to where I want to be.
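Before baking a filter into an alias, it can help to preview which lines it keeps. A rough sketch with grep -E on made-up log lines, assuming your less is built with extended regular expressions (the alternation in the alias relies on the same thing):

```shell
# Hypothetical log lines, only for this demonstration.
log=$(mktemp)
printf '%s\n' '16:34 event1 started' '16:35 heartbeat' '16:36 error: disk full' > "$log"

# Roughly what &event1|action2|state3|error would leave on screen in less.
grep -E 'event1|action2|state3|error' "$log"

# Roughly what the inverted filter &!event1|action2|state3|error would show.
grep -E -v 'event1|action2|state3|error' "$log"
```

The difference is that inside less the unfiltered file is only one &[Enter] away, whereas grep has already thrown the rest out.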
When less is running, the following keys I use most:
- G: Go to end of file. Before that, check if the file has changed. Great for viewing log files.
- g: Go to beginning of file, or, when a number is pressed first, go to that line number.
- ESC-u: Undo search highlighting.
- F: Follow file. It's like tail -f, but after the BREAK signal (CTRL+c), all the niceties of the pager are still intact. And: tail -f cannot do the line wrapping.
- Number%: Type in 50%, and you'll see the middle of the file.
- -S[Enter]: Toggle line wrapping.
- -i[Enter]: Toggle case insensitive search.
- -N and -n: Enable and disable line numbers.
Despite the numerous options, I am not using so much more in my day to day usage of less. Some options can be specified on the command line. The following I consider nice:
- -R: For when the file contains raw characters, for example ANSI color codes. This keeps less from choking on colored output, and lets you see the garbage in a binary file instead of an error message.
- -X: This causes the screen not to be repainted after less exits. After viewing a file or man page, I find it incredibly useful to be able to see the last screen in my terminal emulator's scrollback window. Just make the part visible that is of interest, and you have the reference visible when typing your command. In most terminal emulators, Shift+PgUp/PgDown gets the relevant info back again.
  This setting makes most sense when the output is not too long, or if you know you'll be able to get to the important points in a few jumps. To disable this feature, use -+X as an option.
  It is unfortunate that everything that has been displayed in less is retained in the scrollback. It would be more convenient to only have a screen full of output in the console, especially when needing to browse insane amounts of text. I have not yet managed to fix this.
- -M: Get a long prompt in less. Useful info about filename, length of file, and current position.
To have less always use the options that you like, set the $LESS environment variable in your ~/.bashrc or $ZDOTDIR/.zshrc. I have set mine to the following:
export LESS='MSRiX'
These letters have the following effect: long prompt, chop long lines, raw, case insensitive search, keep output in scrollback buffer.
Possibly your distribution, or you yourself, has set up some nice colors for man pages. I have the following escape codes somewhere in the startup files of my shell:
export LESS_TERMCAP_mb=$'\e[01;31m' # begin blinking
export LESS_TERMCAP_md=$'\e[01;38;5;74m' # begin bold
export LESS_TERMCAP_me=$'\e[0m' # end mode
export LESS_TERMCAP_so=$'\e[38;5;070m' # begin standout (info box, search)
export LESS_TERMCAP_se=$'\e[0m' # end standout-mode
export LESS_TERMCAP_us=$'\e[04;38;5;146m' # begin underline
export LESS_TERMCAP_ue=$'\e[0m' # end underline
export MAN_KEEP_FORMATTING=1
The last entry might not be clear. When piping the output of man to less, the colors are not retained, unless the MAN_KEEP_FORMATTING environment variable has a value.
Starting a search from the shell is also something I often use when I want to look up a single option in a man page: man less | less +/'^ *-X'. This jumps directly to the place where -X is at the front of a line, possibly preceded by some spaces. Most of the time, it just jumps to the explanation that you want to see; otherwise press n to jump to the next hit.
Parameter expansion in zsh
The shell is a high quality text processor, and zsh is especially suited for that purpose. In this post, I will show some of the tricks I use for an editing problem that I encounter every day. Concretely: How to quickly generate queries, after receiving a list of IDs. This is what the result should look like:
select * from table where ID in ('1234', '2345', '3456');
A fast way to achieve this looks like this:
% ar=(
array> 1234
array> 2345
array> 3456
array> )
% print "select * from table where id in (${(j:, :)${(qq)ar[@]}});"
select * from table where id in ('1234', '2345', '3456');
What is happening here? First I am creating an array of something I have available in my paste buffer: I type the name of an array, =, and (, and press enter. Then I paste a list of IDs, close it off with a ). Now the info is available at my fingertips. The second action, the print statement with the hardly memorable parameter expansion, is the main topic of this post.
Shells provide convenience functions to do stuff with parameters, and zsh surely is the most advanced in this regard. These parameter expansions are easiest to read from the inside to the outside, so let's have a look at ${(qq)ar[@]}. This consists of two parts, ${ar[@]} and (qq).
${ar[@]} expands to all the elements of array ar. In shells that support arrays (zsh, bash, ksh; POSIX sh itself has none), you select an element by putting its index in square brackets: in zsh, where indexes start at 1, ${ar[2]} would be 2345. Use @ as the index to say that you want all the elements.
In zsh, if there is an opening parenthesis directly after the opening curly brace, magic is imminent. The parenthesized characters are flags. For a complete overview of what can be done, see man zshexpn | less +/'^ *Parameter Expansion Flags'. In this case, the members of ar are treated with the action that is hiding behind the qq: each element is quoted with single quotes. (You can use three q's for double quotes.)
So the net result of the inner expansion is a copy of the array ar, with the difference that the elements are quoted. This is the intermediate result that the outer expansion, ${(j:, :)…}, works with. The flag j joins the elements of an array, with whatever is between the colons as a separator, in our case a comma followed by a space. The colons are arbitrary: if your join string contains colons itself, you can take a comma or a period, or whatever.
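To see how much work the two flags do, here is roughly the same quote-and-join spelled out in portable shell. This is an illustrative sketch, not how zsh implements it: the function name quote_join is made up, and unlike (qq) it makes no attempt to escape single quotes inside an element.

```shell
# quote_join: wrap every argument in single quotes and join them with ", ".
# Hypothetical helper; elements containing single quotes are not escaped.
quote_join() {
    out=''
    for item in "$@"; do
        # Add the separator before every element except the first.
        if [ -n "$out" ]; then
            out="$out, "
        fi
        out="$out'$item'"
    done
    printf '%s\n' "$out"
}

quote_join 1234 2345 3456
```

A loop, a conditional and a temporary variable, against two flags in a single expansion: that is the kind of typing zsh saves here.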
The result is, as you have seen, that ${(j:, :)${(qq)ar[@]}} is expanded to a comma separated line of quoted elements of array ar. As I use this kind of expansion on a daily basis, and this expansion is a bit too tedious to type in every time, I spent a bit of time to make this expansion available at my finger tips in the form of a shell function a2q:
a2q () {
print ${(j:, :)${(Pqq)1}[@]}
}
This can be used as follows:
% print "select * from table where id in ($(a2q ar));"
Compared to the interactive version, the array is passed as a positional parameter: $1 is being expanded, so you can type a2q myarray to have a2q work on myarray. To make this work, an extra trick is needed: the P flag has been added to the inner parameter expansion flags. It tells zsh to treat the value of $1 as the name of another parameter, and to expand that parameter instead.
This works great, but it can be generalized. Sometimes the list of IDs is given as a file. By virtue of the f flag, the following snippet loads the newline separated contents of file file into array array. The double quotes around the command substitution keep lines that contain spaces in one piece:
array=( ${(f)"$(<file)"} )
Just to facilitate laziness, the function a2q can be expanded to check what type of argument it got, and based on that, populate a temporary array with the information. Multiple arguments are allowed. If no argument is provided, a2q will listen to STDIN, so you can pipe the output of another command to it. After all the processing, the last step is to print the contents of the array in the desired way:
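emulate -L zsh
typeset -U ar
ar=()

# Report an argument that is neither a file nor the name of an array.
_err() {
  echo "Do not understand: ${1}" >&2
  echo "Arguments need to be files, names of arrays, or standard input." >&2
  echo 'Arrays must be referenced by name, so use `array` instead of `$array`.' >&2
}

if (( $# == 0 )); then
  # Listen to STDIN if no arguments are provided
  ar=( ${(f)"$(<&0)"} )
fi

while (( $# > 0 )); do
  if [[ -r "$1" && -f "$1" ]]; then
    # A readable file: one element per line.
    ar=(
      $ar[@]
      "${(fq)$(<$1)}"
    )
  elif [[ ${(Pt)1} = "array" ]]; then
    # The name of an array: take over its elements.
    ar=(
      $ar[@]
      ${(Pq)1}
    )
  else
    # Pass the offending argument along, and return rather than exit:
    # exit would also end an interactive shell that calls a2q directly.
    _err "$1"
    return 1
  fi
  shift
done

print ${(j:, :)${(qq)ar[@]}}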
I stored the above as a2q in a directory whose contents get autoloaded when required, so I have it available at my fingertips.
Enjoy!
About this website
Hello world! This is the first post to my site, which will double as the about page. Yet another site dedicated to consoles, shells, and old school interfaces to solved problems? Isn't there enough documentation and experience out there to warrant another site?
As with all good questions, the answer is both yes and no.
Yes. I proudly want to produce new content as a tribute to all the valuable and intricate pieces of information that have been shared before. I have learnt an awful lot from excellent people documenting informally how they overcome smaller and bigger technological hurdles while accomplishing stuff. And I have enjoyed doing so. A second reason for this resounding yes, is the hope that processes of documenting and publishing trivia, intricacies and insights will assist me in the battle against forgetfulness.
But I like to think that No is also a valid answer: this site will have a distinctive perspective. One thing where I find current informal, online documentation lacking is in the description of interactive usage of tools. Much of the online documentation is a static description of how tools work, whereas little is said about how those tools can be used efficiently in a work flow of analyzing and fixing problems. If this were a blog about cooking, I would try to refrain from showing the ingredients and knives, and rather focus on how to cut and fry an onion to achieve different results.
I have a fetish for basic tools. Be it a sharp knife in the kitchen, or the precise power tools that the One Thing Well software philosophy has brought about. The process of interactive discovery of the real problem is where the flexibility of the software tool philosophy really shines. In this blog I hope to provide some useful perspectives, approaches and recipes about the intersection of basic tools and work flows.
This site is statically generated with Pelican, and I don't feel the need to take care of comments. I do not want to imply that I do not like feedback, but I prefer to get spam in my mailbox rather than on a public site that I need to maintain. For now, for praise and bug reports, you can reach me best over email. I am sure you can puzzle my address together from the following bits of information:
user: escape-code
host: xs4all
tld: nl
Enjoy reading!