scribbu

The extensible tool for tagging your music collection.

This manual corresponds to scribbu version 0.7.0.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.

A copy of the license is also available from the Free Software Foundation Web site at https://www.gnu.org/licenses/fdl.html.

This document was typeset with GNU Texinfo.

1 Introduction
- 1.1 scribbu
- 1.2 Project Background
2 The scribbu Program
3 Scripting scribbu
4 Using libscribbu
- 4.1 Building a libscribbu program
  - 4.1.1 Implementing az-tags
  - 4.1.2 Building az-tags
Frame Identifiers
Character Encodings
Function Index
Index

1 Introduction

scribbu is a C++ library & associated command-line tool for working with ID3 tags (see ID3). It was born when I retired my last Windows machine & could no longer use Winamp (see Winamp) to manage my library of digital music. The scribbu library offers classes & methods for reading, modifying & writing ID3v1 & ID3v2 tags. The scribbu program provides assorted sub-commands for working with ID3-tagged files (e.g. re-naming files based on their tags), but its real power lies in its embedded Scheme interpreter (see The Guile Reference Manual) in which scribbu library features are exported as a Scheme module (on which more below).

scribbu
Project Background

1.1 scribbu

The scribbu project has a few components. The first is a program that provides assorted sub-commands, a few of which are:

scribbu dump will write the contents of any & all ID3 tags found in one or more files to stdout. See Invoking scribbu dump.
scribbu report will generate a report listing ID3 attributes on one or more files on stdout. CSV, TDF & ASCII-delimited formats are supported currently. See Invoking scribbu report.
scribbu rename will rename one or more files based on the contents of their ID3 tags; e.g. scribbu rename -t "%A-%T.mp3" *.mp3 will rename all the files matching “*.mp3” to “<artist>-<title>.mp3” where “artist” and “title” are derived from their ID3 tags (if any). See Invoking scribbu rename.
scribbu popm will update ID3v2 play count & popularimeter tags. For instance, scribbu popm foo.mp3 will increment the play counts in “foo.mp3”. See Invoking scribbu popm.
scribbu text maintains assorted ID3v2 text frames; for instance, scribbu text --artist='Roxy Music' *.mp3 will set the artist frame to “Roxy Music” in all ID3v2 tags in all files matching “*.mp3”. See Invoking scribbu text.

Any sub-command can be invoked with --help or -h for more information. Use the --info option to display the command’s node in this manual.

The scribbu program also exports functions & GOOPS (see GOOPS) classes to a Scheme interpreter, so scribbu can also be invoked...

with a Scheme expression (-e, --expression) or Scheme file (-f, --file). In this case, scribbu will evaluate the given program & exit.
with no arguments at all. In this case, scribbu will drop into a Scheme REPL (Read Evaluate Print Loop) in which the user can evaluate arbitrary Scheme expressions.

as a script:

#!/usr/local/scribbu \
-e main -s
#!
(define (main args)
    ...

Finally, scribbu contains a C++ library (libscribbu, see Using libscribbu) against which one can build C++ libraries & programs.

1.2 Project Background

Some background on MP3, ID3, Winamp & the genesis of this project.

MP3
ID3
Winamp
Today

1.2.1 MP3

Widespread digital encoding of music arrived with the introduction of the compact disc in 1982. However, the size of the resulting digital representation was large: the standard Compact Disc stored about one hour & twenty minutes’ worth of music in about seven hundred MiB (at a time when the typical hard drive could hold ten MiB). In 1989 the relevant standards body (the Moving Picture Experts Group, or MPEG) called for proposals for lossy audio compression algorithms. The fourteen proposals they received were eventually combined into three “layers”, each with a different set of trade-offs between quality, space, and computational complexity. “MPEG Audio Layer I” was the simplest, designed to enable real-time encoding on the hardware of the day. “MPEG Audio Layer II” provides higher quality than Layer I but offers computationally simpler decoding than Layer III. “MPEG Audio Layer III” (or MP3) provides good quality at lower bitrates than Layer II, albeit at the cost of greater computational complexity.

Layer three was primarily developed by the German company Fraunhofer IIS. The file extension .mp3 was selected as a result of an internal survey of researchers at Fraunhofer. At a bit rate of 128 kbit/s, MP3 needed about a megabyte per minute of music encoded, roughly one-tenth the size of CD audio.
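The arithmetic behind these figures is easy to check. The following is a minimal sketch, assuming the usual CD audio parameters (44,100 samples per second, 16 bits per sample, two channels), which are not stated in the text above:

```python
# Back-of-the-envelope check of the sizes quoted above.
# Assumption (not from the text): CD audio is 44,100 samples/s,
# 16 bits per sample, 2 channels.

def bytes_per_minute(bits_per_second: float) -> float:
    """Storage needed for one minute of audio at the given bit rate."""
    return bits_per_second * 60 / 8

MP3_BITRATE = 128_000            # 128 kbit/s
CD_BITRATE = 44_100 * 16 * 2     # 1,411,200 bit/s

mp3_mb = bytes_per_minute(MP3_BITRATE) / 1_000_000
cd_mb = bytes_per_minute(CD_BITRATE) / 1_000_000

print(f"MP3 at 128 kbit/s: {mp3_mb:.2f} MB per minute")
print(f"CD audio:          {cd_mb:.2f} MB per minute")
print(f"ratio:             1/{cd_mb / mp3_mb:.1f}")
```

Under these assumptions a minute of MP3 comes to a little under one megabyte, and a minute of CD audio to a little over ten.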

At one MB per minute, given the size of consumer hard drives in the nineties, home users could easily store many MP3 tracks. The format found such universal application in the portable digital music players becoming available that they came to be known as “mp3 players”. With the network bandwidths available at the time, one could conveniently transmit MP3-encoded files across the internet, and even stream them.

As is typical in the history of technology, the application responsible for the widespread adoption of MP3 was not the one for which it was designed. Applications for MP3-encoded audio were initially thought to be “musical transmission over ISDN telephone lines” and “voice announcement systems for local public transport”. Instead, the medium of choice for digital music became the ‘.mp3’ file.

1.2.2 ID3

A problem quickly emerged: the MP3 standard included no provision for metadata; no way to “tag” an .mp3 file with information such as title, artist, et cetera. NamkraD (AKA Eric Kemp) is credited with the idea of attaching such a tag to .mp3 files in 1996. Presumably to make it easy to detect & parse, while not interfering with existing decoders, it had a fixed size of one hundred twenty-eight bytes, and was attached to the end of the file (if a player that was unaware of the tag played the enclosing file, at worst the user would hear a bit of static at the end). It provided for a thirty-byte title, artist & album along with year, comment & a one-byte genre field. The original proposal defined eighty genres, extended to 148 by the 1.91 release of Winamp (see Winamp) in June 1998 and to 192 by the 5.6 release of Winamp in November 2010.
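The fixed layout makes the tag simple to read. The following is a minimal sketch, independent of scribbu, assuming the conventional field order: a three-byte “TAG” marker, then title, artist & album (thirty bytes each), a four-byte year, a thirty-byte comment and the genre byte. The marker and the year & comment widths come from the ID3v1 convention rather than from the text above, and parse_id3v1 is a hypothetical name:

```python
import struct
from typing import Optional

ID3V1_SIZE = 128  # fixed tag size; the tag sits at the very end of the file

def parse_id3v1(data: bytes, encoding: str = "cp1252") -> Optional[dict]:
    """Parse an ID3v1 tag from the last 128 bytes of `data`, or return None."""
    if len(data) < ID3V1_SIZE:
        return None
    tag = data[-ID3V1_SIZE:]
    # 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes
    marker, title, artist, album, year, comment, genre = struct.unpack(
        "3s30s30s30s4s30sB", tag)
    if marker != b"TAG":
        return None

    def text(field: bytes) -> str:
        # Fields are padded with NULs (sometimes with spaces).
        return field.split(b"\x00", 1)[0].decode(encoding, "replace").rstrip()

    return {"title": text(title), "artist": text(artist), "album": text(album),
            "year": text(year), "comment": text(comment), "genre": genre}

# Build a synthetic tag and parse it back.
example = (b"TAG"
           + b"Lorca's Novena".ljust(30, b"\x00")
           + b"The Pogues".ljust(30, b"\x00")
           + b"Hell's Ditch".ljust(30, b"\x00")
           + b"1990"
           + b"".ljust(30, b"\x00")
           + bytes([255]))
print(parse_id3v1(b"...track data..." + example))
```

The sketch ignores the ID3v1.1 track number, which is carried in the last bytes of the comment field.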

The limitations of this format quickly became apparent, leading to the proposal in 1998 of ID3v2 by Martin Nilsson and several other contributors. Although it shared a name, ID3v2 was a completely different approach to tagging music: it was prepended to the audio data (making it suitable for streaming media) and it was variable-length; ID3v2 tags are composed of multiple frames, each containing one piece of information about the music (title, artist &c).
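Because the tag is variable-length, a reader has to learn its size from the tag itself. The following is a minimal sketch, independent of scribbu, assuming the ten-byte header described in the ID3v2 specification (not in the text above): the marker “ID3”, a version byte, a revision byte, a flags byte and a four-byte “sync-safe” size that uses only the low seven bits of each byte. The names used here are hypothetical:

```python
from typing import NamedTuple, Optional

class Id3v2Header(NamedTuple):
    version: int    # e.g. 3 for ID3v2.3
    revision: int
    flags: int
    size: int       # tag size in bytes, not counting the 10-byte header

def syncsafe_to_int(raw: bytes) -> int:
    """Decode a sync-safe integer: 7 significant bits per byte."""
    value = 0
    for byte in raw:
        value = (value << 7) | (byte & 0x7F)
    return value

def parse_id3v2_header(data: bytes) -> Optional[Id3v2Header]:
    """Return the header at the start of `data`, or None if there is none."""
    if len(data) < 10 or data[:3] != b"ID3":
        return None
    return Id3v2Header(data[3], data[4], data[5], syncsafe_to_int(data[6:10]))

# An ID3v2.3 header with no flags set and a size of 2038 bytes.
print(parse_id3v2_header(b"ID3\x03\x00\x00\x00\x00\x0f\x76"))
```

The frames follow the header; walking them is not attempted here.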

1.2.3 Winamp

Space-efficient, high-quality, tagged audio was no good without a ready means of listening to it. The then-existing Windows Media Player and Real Networks’ Real Player never found widespread adoption. In April 1997 Justin Frankel and Dmitry Boldyrev released Winamp, a small, performant Windows MP3 player. Frankel formed Nullsoft in January 1998. With version 1.5, Winamp changed from freeware to shareware & charged a ten dollar registration fee; far from dampening uptake, this brought in $100,000 a month from $10 paper checks in the mail from paying users. Winamp 2.0 was released in September 1998 & became one of the most downloaded Windows programs ever.

One of the things that endeared Winamp to its users was its plugin architecture. Nullsoft provided several plugins as part of the standard distribution, one of which was the Music Library. Using this, one could manage, organize, search & play a personal library of thousands of MP3 files, all based on ID3 tags (see ID3).

Nullsoft was (in)famously acquired by AOL in 1999. By 2000 Winamp had been registered twenty-five million times, but Nullsoft began to struggle with the problems of so many AOL acquisitions. 2002 saw the misbegotten release of Winamp 3, a complete re-write that broke with the prior ethos of tight, lightweight code. Widespread incidence of users (including the author) reverting to Winamp 2 in response to poor performance & high resource demands of Winamp 3 led to Nullsoft continuing 2.x development, and eventually the release of Winamp 5 (2+3) late in 2003. From version 5.2, Winamp provided the ability to sync the user’s library with iPods, which led to many iPod owners (again including the author) choosing to use Winamp instead of iTunes to manage their devices.

The original Winamp team quit AOL in 2004 & development moved to Dulles (VA). Work continued, albeit at a slower pace. With the release of Winamp 5.66 in late 2013, AOL announced that winamp.com would be shut down later that year and that the software would no longer be available for download. It was later announced that Nullsoft (along with Shoutcast, an MP3 streaming platform) had been sold to the Belgian company Radionomy. As of the time of this writing, winamp.com is up, and offering a download of Winamp 5.8 (beta) from Radionomy.

1.2.4 Today

It is a credit to Winamp that it remained usable well into the twenty-teens as a way to manage large libraries of .mp3 files. Winamp is not quite dead, but it is stranded on an operating system that I have left behind (along, I suspect, with many other technically-inclined music aficionados today). The MP3 format itself is showing its age; Fraunhofer IIS announced in 2017 that it was ending its licensing programs for MP3. AAC is now the standard for digital music.

And yet, I have several thousand .mp3 files in my personal library. Since both MP3 and AAC are lossy formats, transcoding them to AAC would not lead to good results even if I were inclined to do the work. The original sources of many of the .mp3s have been lost, so re-encoding to AAC is not possible.

Perhaps scribbu (see The scribbu Program) will support AAC in the future, but it seems that MP3 & ID3 will be relevant to my musical life for some time. I wrote this tool to help me manage them, and I offer it to anyone else in the same position: if you need to manage ID3-tagged .mp3 files, and especially if you enjoy hacking in LISP and/or C++, I hope you find scribbu useful and enjoyable.

2 The `scribbu` Program

The simplest way to use scribbu is through the command-line tool. For the scribbu command itself, as well as all scribbu subcommands, the -h flag will produce a brief help message on stdout, and the --help flag will display the corresponding man page. You can get a list of all the sub-commands scribbu provides by saying scribbu -h. You can display a given sub-command’s node in this Info manual by saying scribbu CMD --info.

Display this manual by saying scribbu --info.

Example
Invoking scribbu dump
Invoking scribbu encodings
Invoking scribbu genre
Invoking scribbu m3u
Invoking scribbu report
Invoking scribbu rename
Invoking scribbu popm
Invoking scribbu text
Invoking scribbu xtag

2.1 Example

Let us suppose we have a few ‘.mp3’ files which we have just downloaded, or have encoded some time ago & forgotten about. Regardless, we want to examine & update their tags before adding them to our library. The following sections demonstrate this using scribbu sub-commands.

scribbu dump
scribbu report
scribbu popm
scribbu rename

2.1.1 scribbu dump

The simplest place to start is scribbu dump. This will show us what is in the tags:

$>: scribbu dump *.mp3
"lorca.mp3":
ID3v2.3(.0) Tag:
452951 bytes, synchronised
flags: 0x00
The Pogues - Lorca's Novena
Hell's Ditch [Expanded] (US Version) (track 5), 1990
Content-type Pop
TIT2: Lorca's Novena
TPE1: The Pogues
TALB: Hell's Ditch [Expanded] (US Version)
TCON: Pop
TCOM:
TPE3:
TRCK: 5
TYER: 1990
TPE2: The Pogues
COMM (<no description>):
Amazon.com Song ID: 203558254
TCOP: 2004 Warner Music UK Ltd.
TPOS: 1
frame APIC (115554 bytes)
frame PRIV (1122 bytes)
335921 bytes of padding
9425708 bytes of track data:
MD5: 48ff9cadea7d842e9059db25159d2daa
ID3v1.1: The Pogues - Lorca's Novena
Hell's Ditch [Expanded] (US Ve (track 5), 1990
Amazon.com Song ID: 20355825
unknown genre 255

"opium.mp3":
ID3v2.3(.0) Tag:
2038 bytes, synchronised
flags: 0x00
Stephan Luke - Opium Chant Intro
Opium Gardens (track 1), 2003
Content-type General Club Dance
Encoded by Winamp 5.552
TENC: Winamp 5.552
TRCK: 1
COMM (<no description>):
Ripped by Winamp on Pimpernel
TPUB: Opium Music
TPOS: 1/1
TYER: 2003
TCON: General Club Dance
TALB: Opium Gardens
TPE2: Opium Garden
TPE1: Stephan Luke
UFID: http://www.cddb.com/id3/taginfo1.html
      334344334e33395235383037313937335532343836364232394336314239364139333341424332363945364531454642444233445032
TIT2: Opium Chant Intro
1522 bytes of padding
1191874 bytes of track data:
MD5: 690194f49592c7d8ccfbfe8a157d4c1e
ID3v1.1: Stephan Luke - Opium Chant Intro
Opium Gardens (track 1), 2003
Ripped by Winamp on Pimperne
unknown genre 255

"orlando.mp3":
ID3v2.3(.0) Tag:
607 bytes, synchronised
flags: 0x40
Bill LeFaive - Orlando
http://music.download.com (track 1), <no year>
TALB: http://music.download.com
TIT2: Orlando
TIT3: http://music.download.com/
TPE1: Bill LeFaive
TCOM: Bill LeFaive
frame WOAF (1 bytes)
frame WPUB (27 bytes)
frame WXXX (54 bytes)
TRCK: 1
TCOP: 2006 Bill LeFaive
TPUB: http://music.download.com
256 bytes of padding
6549838 bytes of track data:
MD5: dd0c70e13d4aeec676f8d7a7bda622b0
ID3v1.1: Bill LeFaive - Orlando
http://music.download.com (track 1), 2004
http://music.download.com/
unknown genre 255

We see we have three files, all of which have both ID3v1 & ID3v2 tags. The output also contains some basic information on the MP3 data in between the tags.

scribbu dump takes as arguments one or more files and/or directories, and prints information about all files listed, or in the directories named, recursively. With no options, scribbu dump will dump everything it understands. With options, the output can be scoped in various ways (e.g. ID3v2 tags only, ID3v1 tags only, track data only, among other options). The output format can also be controlled; refer to the man page for a complete list (or see Invoking scribbu dump).

2.1.2 scribbu report

Another way to investigate files, especially large numbers of files, is scribbu report:

$>: scribbu report -o myfiles.csv *.mp3
$>: cat myfiles.csv
directory,file,file size(MB),ID3v2 version,ID3v2 revision,ID3v2 size(bytes),ID3v2 flags,ID3v2 unsync,ID3v2 Artist,ID3v2 Title,ID3v2 Album,ID3v2 Content Type,ID3v2 Encoded By,ID3v2 Year,ID3v2 Langauges,# ID3v2 play count frames,Play Count,# ID3v2 comment frames,comment #0 text,comment #1 text,comment #2 text,comment #3 text,comment #4 text,comment #5 text,size (bytes),MD5,has ID3v1.1,has ID3v1 extended,ID3v1 Artist,ID3v1 Title,ID3v1 Album,ID3v1 Year,ID3v1 Comment,ID3v1 Genrre
"/tmp/tut","lorca.mp3",9.421,3,0,452951,0x00,0,The Pogues,Lorca's Novena,Hell's Ditch [Expanded] (US Version),Pop,,1990,,0,,1,Amazon.com Song ID: 203558254,,,,,,9425708,48ff9cadea7d842e9059db25159d2daa,1,0,The Pogues,Lorca's Novena,Hell's Ditch [Expanded] (US Ve,1990,Amazon.com Song ID: 20355825,255
"/tmp/tut","opium.mp3",1.139,3,0,2038,0x00,0,Stephan Luke,Opium Chant Intro,Opium Gardens,General Club Dance,Winamp 5.552,2003,,0,,1,Ripped by Winamp on Pimpernel,,,,,,1191874,690194f49592c7d8ccfbfe8a157d4c1e,1,0,Stephan Luke,Opium Chant Intro,Opium Gardens,2003,Ripped by Winamp on Pimperne,255
"/tmp/tut","orlando.mp3",6.247,3,0,607,0x40,0,Bill LeFaive,Orlando,http://music.download.com,,,,,0,,0,,,,,,,6549838,dd0c70e13d4aeec676f8d7a7bda622b0,1,0,Bill LeFaive,Orlando,http://music.download.com,2004,http://music.download.com/,255

Running scribbu report (see Invoking scribbu report) with an option of -o <name>.csv will produce an RFC-4180-compliant comma-separated values file reporting on the files given on the command line. Option -t will instead produce tab-delimited data. scribbu itself provides little beyond this in terms of reporting, the idea being that CSV or TDF output can be readily imported into other programs better suited to that task.
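As one illustration of handing the report to another program, the following minimal sketch reads the CSV with Python's standard csv module and maps each file to its ID3v2 version. The sample data reproduces only the first four columns of the report shown above, and id3v2_versions is a hypothetical helper name:

```python
import csv
import io

# A cut-down stand-in for myfiles.csv: just the first four columns
# of the report above.
SAMPLE_CSV = (
    "directory,file,file size(MB),ID3v2 version\n"
    '"/tmp/tut","lorca.mp3",9.421,3\n'
    '"/tmp/tut","opium.mp3",1.139,3\n'
    '"/tmp/tut","orlando.mp3",6.247,3\n'
)

def id3v2_versions(stream):
    """Map each file name in the report to its ID3v2 version number."""
    reader = csv.DictReader(stream)
    return {row["file"]: int(row["ID3v2 version"]) for row in reader}

print(id3v2_versions(io.StringIO(SAMPLE_CSV)))
```

With a real report one would pass open("myfiles.csv", newline="") in place of the in-memory sample.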

That said, one can do some basic querying at the command line, for which tab-delimited format can be convenient. For example, this little awk program will show the ID3v2 version for each file:

$>: scribbu report -t -o myfiles.tdf *.mp3
$>: cat myfiles.tdf | awk 'BEGIN {FS="\t"}; {print $2, $4}'
file ID3v2 version
"lorca.mp3" 3
"opium.mp3" 3
"orlando.mp3" 3

2.1.3 scribbu popm

We notice that none of these files contain a PCNT or a POPM frame; let’s add them now:

$>: scribbu popm -a -o foo@bar.com -r oooo *.mp3
$>: scribbu dump *.mp3
...
PCNT: 0
POPM: foo@bar.com
rating: 179
counter: 00
...

The popm command can be used to manage both PCNT & POPM frames (see Invoking scribbu popm). The -a flag indicates that we want to create the relevant frames, if they don’t already exist. -r sets the rating for the POPM frame; you can provide an integer between 0 and 255, or use the “star” system. In this case, I’ve given all the files four stars. “****” would be more mnemonic, but inconvenient in the shell, so scribbu will recognize almost any character, repeated one to five times, as a “star”.
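The two ways of writing a rating can be told apart with very little logic. The following is a minimal sketch of one possible reading of the rule above, not scribbu's actual parser: it assumes that an all-digit argument is always taken as a number (so digits cannot serve as “stars”), and it does not attempt to map stars onto the 0 to 255 scale, since the text gives only the one data point (four stars shown as 179). parse_rating is a hypothetical name:

```python
from typing import Tuple

def parse_rating(arg: str) -> Tuple[str, int]:
    """Classify a rating argument as ("value", 0..255) or ("stars", 1..5)."""
    if arg.isascii() and arg.isdigit():
        # Assumption: digits always mean a numeric rating.
        value = int(arg)
        if not 0 <= value <= 255:
            raise ValueError(f"numeric rating out of range: {arg}")
        return ("value", value)
    if 1 <= len(arg) <= 5 and len(set(arg)) == 1:
        # One character repeated one to five times counts as stars.
        return ("stars", len(arg))
    raise ValueError(f"not a rating: {arg!r}")

print(parse_rating("oooo"))   # the form used in the example above
print(parse_rating("179"))
```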

Having created the PCNT and/or POPM frames, one can update the play counts with a simple command; e.g.

scribbu popm opium.mp3

will increment the play count by one in any PCNT or POPM frames it finds. The operation can be scoped or modified in a number of ways, such as limiting it to only one or the other, or only to POPM frames with a certain owner; see the man page or Invoking scribbu popm for full details. The intent of the command is to enable players that don’t update the play count or set ratings themselves, but which can be scripted or extended in some way, to do so.

2.1.4 scribbu rename

Finally, let us re-name the files based on their ID3 tags:

scribbu rename *.mp3

will by default rename each file to “<artist> - <title>.mp3” with <artist> & <title> each derived from the corresponding ID3v2 frame (see Invoking scribbu rename). This can be customized by providing a “template”: text interspersed with replacement parameters to be filled in with tag contents. The parameters begin with a ‘%’, and each parameter has a one-character “short” form and a more descriptive “long” form. For instance, “artist” can be represented as either %A or %(artist). So the default template could be expressed as “%A - %T.mp3” or “%(artist) - %(title).mp3”.

If the long form is used, the action of the replacement parameter may optionally be modified by options given after the parameter name and a colon in “query-style”: opt0&opt1&opt2... where each option is of the form name=value or just name. For instance, if we wanted the artist to always be taken from the ID3v1 tag, and that field happens to use the ISO-8859-1 character encoding, we could say:

%(artist:v1-only&v1-encoding=iso-8859-1)
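To make the template syntax concrete, the following minimal sketch expands short and long parameters and splits query-style options. It is an illustration only, not scribbu's implementation: it knows just the two parameters named above (artist and title), it parses options without acting on them, and it makes no attempt at any processing scribbu itself applies to the tag text. The names expand and parse_options are hypothetical:

```python
import re
from typing import Dict, Optional

# Only the two short forms mentioned in the text.
SHORT_FORMS = {"A": "artist", "T": "title"}

# Matches %X (short form) or %(name) / %(name:opt0&opt1=value) (long form).
PARAM = re.compile(
    r"%(?:(?P<short>[A-Za-z])|\((?P<name>[a-z0-9-]+)(?::(?P<opts>[^)]*))?\))")

def parse_options(opts: Optional[str]) -> Dict[str, Optional[str]]:
    """Split 'a&b=c' into {'a': None, 'b': 'c'}."""
    result: Dict[str, Optional[str]] = {}
    for item in filter(None, (opts or "").split("&")):
        name, sep, value = item.partition("=")
        result[name] = value if sep else None
    return result

def expand(template: str, tags: Dict[str, str]) -> str:
    """Fill in each replacement parameter from `tags` (options are ignored)."""
    def replace(match: "re.Match[str]") -> str:
        name = match.group("name") or SHORT_FORMS[match.group("short")]
        return tags[name]
    return PARAM.sub(replace, template)

tags = {"artist": "Stephan Luke", "title": "Opium Chant Intro"}
print(expand("%A - %T.mp3", tags))
print(expand("%(artist) - %(title).mp3", tags))
print(parse_options("v1-only&v1-encoding=iso-8859-1"))
```

Both templates produce the same file name, and the option string splits into a bare flag (v1-only) and a name=value pair (v1-encoding).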

Let us accept the default settings, but see what would happen without actually re-naming anything:

scribbu rename -n *.mp3
"lorca.mp3" => "Pogues, The - Lorca's Novena.mp3"
"opium.mp3" => "Stephan Luke - Opium Chant Intro.mp3"
"orlando.mp3" => "Bill LeFaive - Orlando.mp3"

Before we rename the files, there is a lot more hygiene that could be carried out. “lorca.mp3” has a number of empty text frames that should be removed, “opium.mp3” has a comment frame with no owner, and the ID3v1 genre in all three is set to “255”.

As I developed scribbu, and began using it to manage my personal music collection, it became clear that providing a sub-command for every conceivable operation was not feasible. Furthermore, many of the things I wanted it to do were one-off tasks pertinent to a single file, or a handful of files, that weren’t worth formally coding up as sub-commands. What I really wanted was a way to “script” libscribbu (see Using libscribbu). I found my solution in Guile (see The Guile Reference Manual), which is the topic of the next chapter.

2.2 Invoking `scribbu dump`

scribbu dump will walk each file and/or directory specified (recursing directories), read each file found, look for ID3 tags therein, and pretty-print what it finds to stdout.

scribbu dump Options

2.2.1 `scribbu dump` Options

scribbu dump accepts the following options:

-1|--id3v1-tags display only ID3v1 tags
-D|--track-data display track data only
-2|--id3v2-tags display only ID3v2 tags
-i arg|--indent=arg indent all output arg spaces
-g|--no-expand-genre don’t attempt to expand the genre when specified as a numeric constant
-e regex|--expression=regex define a regular expression for filtering the files to be pretty-printed. For each file, its entire path will be matched against this regular expression before being pretty-printed (non-matches will be ignored).
-f FMT|--format=FMT specify the desired output format; at present, only two values are supported for this option: standard, which is the default, pretty-prints the selected portions of each file to stdout with one attribute (artist, title &c) per line. A format of csv will print one line in CSV format per file.
The precise columns output in csv format will depend on the other options, but include:

for ID3v2 tags: version, revision, size, flags, unsynchronised, artist, title, album, genre, encoded by, year, languages, play count, comments

for track data: size & MD5 checksum

for ID3v1 tags: v1.1, extended, artist, title, album, year, comment & genre.
-c ENC|--v1-encoding=ENC specify an encoding for ID3v1 tags (CP1252 by default). See Character Encodings for a complete list of the encodings supported and their identifiers.

2.3 Invoking `scribbu encodings`

Several scribbu commands accept character encodings as options. It is not always clear what encodings are supported or how to specify them textually. This sub-command will print the list of supported character encodings along with their names for the user’s convenience.

That list is reproduced here:

ASCII
ISO-8859-1
ISO-8859-2
ISO-8859-3
ISO-8859-4
ISO-8859-5
ISO-8859-7
ISO-8859-9
ISO-8859-10
ISO-8859-13
ISO-8859-14
ISO-8859-15
ISO-8859-16
KOI8-R
KOI8-U
KOI8-RU
CP1250
CP1251
CP1252
CP1253
CP1254
CP1257
CP850
CP866
CP1131
MacRoman
MacCentralEurope
MacIceland
MacCroatian
MacRomania
MacCyrillic
MacUkraine
MacGreek
MacTurkish
Macintosh
ISO-8859-6
ISO-8859-8
CP1255
CP1256
CP862
MacHebrew
MacArabic
EUC-JP
SHIFT-JIS
CP932
ISO-2022-JP
ISO-2022-JP-2
ISO-2022-JP-1
ISO-2022-JP-MS
EUC-CN
HZ
GBK
CP936
GB18030
EUC-TW
BIG5
CP950
BIG5-HKSCS
BIG5-HKSCS:2004
BIG5-HKSCS:2001
BIG5-HKSCS:1999
ISO-2022-CN
ISO-2022-CN-EXT
EUC-KR
CP949
ISO-2022-KR
JOHAB
ARMSCII-8
Georgian-Academy
Georgian-PS
KOI8-T
PT154
RK1048
TIS-620
CP874
MacThai
MuleLao-1
CP1133
VISCII
TCVN
CP1258
HP-ROMAN8
NEXTSTEP
UTF-8
UCS-2
UCS-2BE
UCS-2LE
UCS-4
UCS-4BE
UCS-4LE
UTF-16
UTF-16BE
UTF-16LE
UTF-32
UTF-32BE
UTF-32LE
UTF-7
C99
JAVA

2.4 Invoking `scribbu genre`

scribbu genre will set the genre (i.e. the TCON frame for ID3v2 tags and the genre byte for ID3v1) for all tags in all files named on the command line. If an argument is a file, operate on the tags in that file. If the argument is a directory, operate recursively on all files containing ID3 tags therein.

The genre can be specified in a few ways:

scribbu genre -w N

will interpret N as one of the genres defined by Winamp, specified as an integer between 0 & 191 (inclusive). Run scribbu genre -W to print a list of the Winamp genres.

scribbu genre -g GENRE

will attempt to map the string GENRE to one of the Winamp genres using Damerau-Levenshtein distance, but disregarding case. For instance scribbu genre -g rok will be interpreted as Winamp genre number seventeen "Rock".

scribbu genre -G GENRE

will accept GENRE uncritically as the TCON to be used for ID3v2 tags. ID3v1 tags, if present, will have their genre field mapped to one of the Winamp values, again by case-insensitive Damerau-Levenshtein distance (or just set to 255 if that fails). To explicitly set the ID3v1 genre when specifying the genre in this way, add the --v1 flag (e.g. scribbu genre -G foo --v1 17).
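The fuzzy matching used by -g and -G can be illustrated as follows. This is a minimal sketch, not scribbu's code: it uses the “optimal string alignment” variant of Damerau-Levenshtein distance (which variant scribbu uses is not stated here), it includes only the first twenty genres of the list rather than all 192, and it applies no cut-off for a failed match. The names osa_distance and closest_genre are hypothetical:

```python
def osa_distance(a: str, b: str) -> int:
    """Damerau-Levenshtein distance, optimal string alignment variant."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                # transposition of two adjacent characters
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

# Genres 0..19 only; the full list has 192 entries.
GENRES = ["Blues", "Classic Rock", "Country", "Dance", "Disco", "Funk",
          "Grunge", "Hip-Hop", "Jazz", "Metal", "New Age", "Oldies", "Other",
          "Pop", "R&B", "Rap", "Reggae", "Rock", "Techno", "Industrial"]

def closest_genre(text: str):
    """Return (index, name) of the genre nearest to `text`, ignoring case."""
    wanted = text.lower()
    index = min(range(len(GENRES)),
                key=lambda i: osa_distance(wanted, GENRES[i].lower()))
    return index, GENRES[index]

print(closest_genre("rok"))
```

Here “rok” is one insertion away from “rock” and at least two edits away from every other entry in the shortened list, so genre seventeen is selected.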

This brings up the question of what to do if there is no ID3v1 and/or no ID3v2 tag(s) in a given file. By default, in the absence of a tag, nothing will be done (so if invoked, for instance, on a file with neither an ID3v1 nor an ID3v2 tag, this sub-command would do nothing). This behavior can be customized in two ways. If --create-v2 or --create-v1 is given, an ID3v2 (ID3v1, resp.) tag will be created for any file which already possesses an ID3v1 (ID3v2, resp.) tag. If --always-create-v2 or --always-create-v1 is given, an ID3v2 (ID3v1, resp.) tag will always be created if it doesn’t exist.
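The rules in the preceding paragraph amount to a small truth table. The following minimal sketch encodes them as stated; the function and parameter names are hypothetical, and it does not check for combinations of flags that the command itself may reject:

```python
from typing import Tuple

def tags_to_create(has_v1: bool, has_v2: bool, *,
                   create_v1: bool = False, create_v2: bool = False,
                   always_create_v1: bool = False,
                   always_create_v2: bool = False) -> Tuple[bool, bool]:
    """Return (make_v1, make_v2): which missing tags should be created."""
    # --create-vN applies only when the *other* kind of tag is present;
    # --always-create-vN applies whenever the tag is missing.
    make_v1 = not has_v1 and (always_create_v1 or (create_v1 and has_v2))
    make_v2 = not has_v2 and (always_create_v2 or (create_v2 and has_v1))
    return make_v1, make_v2

# A file with only an ID3v1 tag, processed with --create-v2:
print(tags_to_create(True, False, create_v2=True))
# A file with no tags at all, processed with --create-v2:
print(tags_to_create(False, False, create_v2=True))
```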

scribbu genre Behavior on Write
scribbu genre Options

2.4.1 `scribbu genre` Behavior on Write

If the command didn’t change the tagset for a given file, that file will not be modified. Typically however, the tagset will have been changed, and we need to write out the new tagset. Since ID3v1 tags are fixed-size blocks appended to the file, writing them out is trivial.

The default behavior for ID3v2 tags is to first try to emplace them; that is, write them over the current set of ID3v2 tags without touching track data, adjusting the padding if needed. If that is impossible (i.e. the new tagset can’t be fit into the space occupied by the old, even when adjusting padding) a full copy will be made by writing the new tagset to a temporary file, copying the track data from the extant file to the temporary file, and finally appending the ID3v1 tag, if any, to the temporary file. Only then is the temporary file renamed to the original (which is hopefully atomic).

This behavior can be modified by the --create-backups option (see scribbu genre Options), which will create a backup of the original file before renaming the temporary file.
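The emplace-or-copy strategy described in this section can be sketched as follows. This is a simplified illustration, not scribbu's implementation: the new tag is treated as an opaque, already-serialized byte string, old_tag_size is assumed to cover the old ID3v2 tag plus its padding, the “.bak” suffix is made up, and the backup is simply taken up front. A real implementation would also have to rewrite the size field in the tag header to account for the padding:

```python
import os
import shutil
import tempfile

def write_id3v2(path: str, new_tag: bytes, old_tag_size: int,
                backup: bool = False) -> str:
    """Replace the first `old_tag_size` bytes of `path` with `new_tag`.

    Returns "emplaced" or "copied" to say which strategy was used.
    """
    if backup:
        shutil.copy2(path, path + ".bak")   # hypothetical backup name

    if len(new_tag) <= old_tag_size:
        # Emplace: overwrite the old tag and pad with zero bytes, so the
        # track data that follows never moves.
        padding = b"\x00" * (old_tag_size - len(new_tag))
        with open(path, "r+b") as f:
            f.write(new_tag + padding)
        return "emplaced"

    # Full copy: new tag first, then everything after the old tag
    # (track data, followed by the ID3v1 tag if there is one).
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as out, open(path, "rb") as src:
        out.write(new_tag)
        src.seek(old_tag_size)
        shutil.copyfileobj(src, out)
    os.replace(tmp_path, path)   # rename the temporary file over the original
    return "copied"
```

Creating the temporary file in the same directory as the original keeps the final rename on a single filesystem, which is what gives it a chance of being atomic.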

2.4.2 `scribbu genre` Options

-n|--dry-run dry-run mode; only print what would happen
-u|--adjust-unsync adjust each ID3v2 tag’s use of the unsynchronisation scheme on write (by default, it’s never used)
-a|--always-create-v2 always create an ID3v2 tag with a TCON frame for any file that does not possess one (whether or not there’s an ID3v1 tag present); may be combined with --always-create-v1 or --create-v1, but not with --create-v2.
-A|--always-create-v1 always create an ID3v1 tag with the genre field set appropriately for any file that does not possess one (whether or not there’s an ID3v2 tag present); may be combined with --always-create-v2 or --create-v2, but not with --create-v1.
-b|--create-backups by default, the new tagset will be written in-place (emplacing ID3v2 tags, if feasible); this option will cause a backup file to be made before changing the original (see scribbu genre Behavior on Write)
-c|--create-v2 create an ID3v2 tag with a TCON frame for any file that has an ID3v1 tag but does not have an ID3v2 tag (fields from the ID3v1 tag will be copied over); may be combined with --always-create-v1 or --create-v1, but not with --always-create-v2
-C|--create-v1 create an ID3v1 tag with the genre field set appropriately for any file that has an ID3v2 tag, but no ID3v1 tag; may be combined with --always-create-v2 or --create-v2, but not with --always-create-v1
-g GENRE|--genre=GENRE specify GENRE as the textual name of one of the 192 Winamp-defined genres ("Rock", e.g.); if GENRE doesn’t exactly match case-insensitively one of the Winamp genres, it will be matched to the official list by minimal Damerau-Levenshtein distance (again without regard to case). The genre field for ID3v1 tags will be the corresponding numeric value.
See also --list-winamp-genres.
-G GENRE|--Genre=GENRE specify GENRE as the verbatim text to be used for ID3v2 TCON frames (i.e. no matching to the Winamp-defined list will be done). The value for the genre field in ID3v1 tags will however be determined by the closest match to the Winamp-defined list. See also --v1 below for how to turn off that behavior.
-W|--list-winamp-genres list the Winamp-defined genres, piped through your pager if scribbu can determine that (the environment variables SCRIBBU_PAGER & then PAGER are checked first, then any program named less on PATH will be used). See also --no-pager.
-P|--no-pager do not use any pager when printing the Winamp genre list; just print to stdout
-t INDEX|--tag=INDEX specify a zero-based index describing which ID3v2 tag to alter, in the case of multiple ID3v2 tags in a given file; may be given more than once to select multiple tags; if not given, all tags present will be modified
-v N|--v1=N numeric genre to use for ID3v1 tags when -G is given
-1|--v1-only only update ID3v1 tags; ignore any ID3v2 tags found
-2|--v2-only only update ID3v2 tags; ignore any ID3v1 tags found
-w N|--winamp=N specify the genre numerically in terms of the 192 Winamp-defined genres

2.5 Invoking `scribbu m3u`

scribbu m3u will walk each file and/or directory specified (recursing directories) and print extended M3U entries for each. By default, it will print EXTM3U entries to stdout for all files named (directly or indirectly) on the command line. If an argument is a directory, this command will operate recursively on all files therein (the order of traversal is unspecified).

Each extended M3U entry takes the form:

    #EXTINF:<duration-in-seconds>,<display title>
    <path-to-file>

The display title will be "Artist - Title" if those two items can be derived from ID3 tags; otherwise the file basename will be used. The text forming the artist & title tags will be assumed to be in the system locale’s encoding. To override this, specify the -s flag (for source encoding).

The entry’s path will be relative or absolute, according to the argument (i.e. specifying an absolute path to a file or directory will produce absolute paths in the output, while a relative path will produce relative paths).
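The entry format and the fall-back rule for the display title can be sketched as follows. This is an illustration rather than scribbu's code: it uses the conventional #EXTINF directive of extended M3U, the durations are invented, the fall-back keeps the file extension in the basename (a guess), and extm3u_entry is a hypothetical name:

```python
import os
from typing import Optional

def extm3u_entry(path: str, duration_seconds: int,
                 artist: Optional[str] = None,
                 title: Optional[str] = None) -> str:
    """Format one extended M3U entry (two lines) for `path`."""
    if artist and title:
        display = f"{artist} - {title}"
    else:
        # Fall back to the file's basename when the tags are unavailable.
        display = os.path.basename(path)
    return f"#EXTINF:{duration_seconds},{display}\n{path}\n"

# Durations below are made up for the sake of the example.
playlist = ("#EXTM3U\n"
            + extm3u_entry("tut/opium.mp3", 74, "Stephan Luke", "Opium Chant Intro")
            + extm3u_entry("tut/untagged.mp3", 180))
print(playlist, end="")
```

The path is written exactly as given, so relative arguments yield relative entries and absolute arguments yield absolute ones.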

When writing entries to stdout, all text will be written in the system locale’s text encoding.

If the -o option is given, the output will be written to the file named as the option’s value. By convention, files ending in .m3u8 are UTF-8 encoded and files ending in .m3u are written in an unspecified encoding. Given that M3U is a de facto standard, scribbu does not enforce this (or any other naming convention).

When the -o option is given, a new file will be created with the #EXTM3U header line, unless the -a (append) flag is also given, in which case the output will be appended to the (presumably existing) file.

By default, output will be in the system locale’s text encoding. To force UTF-8 output, specify the -8 option.

So, for instance, if your system locale’s encoding is ISO-8859-1, and your tags are written in, say, Windows Code Page 1251, but you would like an M3U playlist in UTF-8 format, say:

scribbu m3u -s CP1251 -o test.m3u8 -8 some-directory/
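
What the -s and -8 options accomplish between them amounts to a decode followed by an encode. A minimal Python sketch, assuming nothing about how scribbu itself performs the conversion (the function name transcode and the sample text are made up):

```python
def transcode(raw, source_encoding, target_encoding="utf-8"):
    """Decode tag bytes from source_encoding and re-encode them."""
    text = raw.decode(source_encoding)   # UnicodeDecodeError on invalid input
    return text.encode(target_encoding)  # UnicodeEncodeError if unrepresentable


tag_bytes = "Кино".encode("cp1251")  # made-up artist field stored as CP1251
utf8_bytes = transcode(tag_bytes, "cp1251")

# The same text cannot be written in an ISO-8859-1 locale encoding, which is
# why forcing UTF-8 output matters in the scenario above.
try:
    transcode(tag_bytes, "cp1251", "iso-8859-1")
    representable = True
except UnicodeEncodeError:
    representable = False
```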

scribbu m3u Options

2.5.1 `scribbu m3u` Options

-v|--verbose Produce more verbose output; this really only makes sense when printing entries to file (otherwise the informational messages will be intermingled with the entries printed to stdout).
-s ARG|--source-encoding=ARG Specify the text encoding in which the textual tags artist & title are written in the files to be processed; e.g. "ASCII" or "UTF-8". Say scribbu encodings for a list of names for all supported encodings.
-o ARG|--output=ARG This option indicates that the EXTM3U entries shall be written to the given file rather than stdout.
-a|--append When writing to file, this option indicates that the named file should be appended to rather than overwritten.
-8|--use-utf-8 When writing to file, this option indicates that the output shall be encoded as UTF-8 (rather than the system locale’s text encoding, which is the default behavior).
-e|--on-encoding-failure Specify handling for encoding failures; may be one of fail (the default), transliterate or ignore.

2.6 Invoking `scribbu report`

scribbu report will walk each file and/or directory specified (recursing directories), read each file found, look for ID3 tags therein, and generate a report on their contents. The idea is to use scribbu to do the work of scanning the tags in combination with some other tool better suited to querying & reporting. Consequently, filtering mechanisms are minimal, and the output formats (CSV or TDF) are chosen to facilitate transformation to other formats as well as import by other tools.

scribbu report Options

2.6.1 `scribbu report` Options

-c ARG|--num-comments=ARG number of comments to be reported; the total number of comment frames is always reported, but this governs the number of comment frames (owner, text &c) to be included in the report (default six)
-o ARG|--output=ARG the file to which the report shall be written
-1 ENC|--v1-encoding=ENC specify the encoding (see Character Encodings) to be used to read ID3v1 tags (defaults to CP1252)
-t|--tsv-format select tab-delimited format instead of comma-separated values (the default).
-a|--ascii-delimited if using tab-delimited format, output ASCII-delimited text by using 0x1f (the ASCII unit separator) to delimit fields rather than TABs.
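
Since the report is meant to be consumed by other tools, here is a minimal Python sketch of reading ASCII-delimited output with the standard csv module. The column names and rows below are invented for illustration; the actual columns scribbu report emits are not shown here.

```python
import csv
import io

UNIT_SEPARATOR = "\x1f"  # the ASCII unit separator, 0x1f


def read_ascii_delimited(stream):
    """Parse unit-separator-delimited text into a list of rows."""
    return list(csv.reader(stream, delimiter=UNIT_SEPARATOR))


# Invented sample data; a real report would be opened from a file instead.
sample = io.StringIO(
    "artist\x1ftitle\n"
    "Pogues, The\x1fThe Body of an American\n"
)
rows = read_ascii_delimited(sample)
```

One attraction of this delimiter is that fields containing commas or tabs (such as "Pogues, The") need no quoting.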

2.7 Invoking `scribbu rename`

scribbu rename will walk each file and/or directory specified (recursively, in the case of directories) and rename each ID3-tagged file found according to its tag(s). By default, each ID3-tagged file will be renamed to “<artist> - <title>.<extension>” (where <artist> & <title> are derived from the file’s ID3 tags), but this can be heavily customized by specifying a naming “template” made up of a mixture of text and replacement parameters (such as artist, title, album &c).

Replacement parameters begin with a % character (percent characters that do not begin a replacement parameter may be escaped with a backslash). Each replacement parameter has a one-character “short” form as well as a “long-form” name. For example, the artist replacement can be represented as either %A or as %(artist).

When the long form is used, the action of replacement may optionally be modified by giving options after a colon. The options take the form opt0&opt1&opt2&... where each option is of the form name=value, or just name. So to continue the above example, if we wanted the artist name to instead be derived from the ID3v1 tag, and that field was encoded as ISO-8859-1, we would say:

%(artist:v1-only&v1-encoding=iso-8859-1)

See Tag-Based Replacements for a complete list of replacement parameters & their options.
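
The long-form syntax is regular enough to be parsed with a short regular expression. The following Python sketch shows one way to split a long-form parameter into its name and options; it is an illustration of the syntax only, and the names LONG_FORM and parse_replacement are hypothetical rather than part of scribbu.

```python
import re

# %(name) or %(name:opt0&opt1&...)
LONG_FORM = re.compile(r"%\((?P<name>[^:)]+)(?::(?P<opts>[^)]*))?\)")


def parse_replacement(text):
    """Split a long-form replacement into (name, options)."""
    match = LONG_FORM.fullmatch(text)
    if match is None:
        raise ValueError(f"not a long-form replacement: {text!r}")
    options = {}
    for opt in filter(None, (match.group("opts") or "").split("&")):
        name, sep, value = opt.partition("=")
        # Bare options such as v1-only are recorded as True.
        options[name] = value if sep else True
    return match.group("name"), options


parsed = parse_replacement("%(artist:v1-only&v1-encoding=iso-8859-1)")
```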

scribbu rename Options
Tag-Based Replacements
File-based Replacements

2.7.1 `scribbu rename` Options

-h,--help Display help & exit
-n,--dry-run Dry-run; only print what would happen
-o ARG,--output=ARG If specified, copy the output files to this directory, rather than renaming in-place.
-r,--rename Remove the source file (ignored if --dry-run is given)
-t TEMPLATE,--template=TEMPLATE The template by which to rename ID3-tagged files in the arguments (defaults to “%A - %T.mp3”)
-v,--verbose Produce verbose output.

2.7.2 Tag-Based Replacements

Tag-based replacement parameters:

Content	Short-Form	Long-Form
album	L	album
artist	A	artist
content type	G	content-type,genre
encoded by	e	encoded-by
title	T	title
year	Y	year

Tag-based replacement parameters take the following options:

Source of the replacement text:
- prefer-v2
- prefer-v1
- v2-only
- v1-only
character encoding when the ID3v1 tag is used: v1-encoding=...
- auto
- iso-8859-1
- ascii
- cp1252
- utf-8
- utf-16-be
- utf-16-le
- utf-32
Handling “The...”: the=...
- suffix (i.e. “The Pogues” will be changed to “Pogues, The”)
- prefix
capitalization: cap=...
- all-upper
- all-lower
handling whitespace: either compress can be given (to merge space between words to a single space) or ws=TEXT can be given to replace whitespace (e.g. if ws=_ were given, “a b” would become “a_b”).

Lastly, the year can be formatted as two digits or four by giving “yy” or “yyyy” in the options for %(year).

E.g. %(artist:prefer-v2&v1-encoding=cp1252&the=suffix&compress) applied to a file whose ID3v2 tag had an artist frame of "The Pogues" would produce "Pogues, The".
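
To make the effect of these options concrete, here is a small Python sketch of the text transformations described above. It is not scribbu's code: the function name apply_options is made up, the order in which the options are applied is an assumption, and the=prefix is omitted because its behavior is not spelled out here.

```python
import re


def apply_options(text, the=None, cap=None, compress=False, ws=None):
    """Apply the 'the', capitalization and whitespace options to text."""
    if compress:
        # Merge runs of whitespace between words into a single space.
        text = re.sub(r"\s+", " ", text).strip()
    if the == "suffix" and text.lower().startswith("the "):
        text = f"{text[4:]}, {text[:3]}"
    if cap == "all-upper":
        text = text.upper()
    elif cap == "all-lower":
        text = text.lower()
    if ws is not None:
        # Replace each whitespace character with the given text.
        text = re.sub(r"\s", lambda _: ws, text)
    return text


renamed = apply_options("The  Pogues", the="suffix", compress=True)
```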

2.7.3 File-based Replacements

There are a few more replacement parameters based on the file itself:

b,basename: The file basename
E,extension: The file extension (including the dot)

Both of these take the same “The”, capitalization & whitespace options as Tag-Based Replacements.

5,md5: the MD5 checksum of the file’s audio data
S,size: the file size, in bytes

Both of these take the following options:

base=(decimal|hex): specify the radix for the numbers
hex-case=(U|L): case to use for hexadecimal numbers
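
The base and hex-case options correspond to ordinary number formatting. A Python sketch of the idea follows; the function name format_number is hypothetical, and treating upper case as the default when hex-case is omitted is an assumption.

```python
def format_number(value, base="decimal", hex_case="U"):
    """Render a non-negative integer per the base and hex-case options."""
    if base == "decimal":
        return str(value)
    if base == "hex":
        return format(value, "X" if hex_case == "U" else "x")
    raise ValueError(f"unknown base: {base!r}")


size_in_hex = format_number(4096, base="hex", hex_case="L")
```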

2.8 Invoking `scribbu popm`

scribbu popm creates or updates play count and/or popularimeter frames. With no options, it increments the counter fields in every play count and/or popularimeter frame in every tag by one. With the --create-frame flag, it creates the relevant frames in each tag that lacks them. Popularimeter frames will not be created in the absence of the --owner option. Play count & popularimeter frame creation can be inhibited via the --popularimeter-only and --playcount-only flags, respectively.

The popularimeter rating field can be set using the --rating option. Ratings can be specified explicitly as an integer between 0 & 255, or as one-to-five stars. “Stars” would most naturally be expressed as *s (asterisks), but since this will often be inconvenient in the shell, scribbu will accept almost any character, repeated one-to-five times.

scribbu popm Options

2.8.1 `scribbu popm` Options

-h,--help display help & exit with status zero
-n,--dry-run don’t modify the files named in the arguments; just print what would have been done
-f,--create-frame create playcount and/or popularimeter frames in any tags that are missing them. This can be restricted by the --popularimeter-only and --playcount-only flags. Popularimeter frames will only be created if the --owner flag is given, as well.
-c|--create-v2 create an ID3v2 tag with POPM and/or PCNT frames for any file that has an ID3v1 tag but does not have an ID3v2 tag; may not be combined with --always-create-v2
-a|--always-create-v2 always create an ID3v2 tag with a POPM and/or PCNT frame for any file that does not possess one (whether or not there’s an ID3v1 tag present); may not be combined with --create-v2.
-b,--create-backups by default, the new tagset will be written in-place (emplacing, if possible); this option will cause a backup file to be made first.
-C COUNT,--count=COUNT set all counter fields to COUNT instead of incrementing
-i INCR,--increment=INCR increment all counter fields by INCR, instead of by one.
-o OWNER,--owner=OWNER Specify the owner field for popularimeter frames. If incrementing count fields, only popularimeter frames with an owner of OWNER will be updated. When creating popularimeter frames, the owner field will be set to OWNER.
-p,--playcount-only if present, this switch will limit operations to playcount frames only
-m,--popularimeter-only if present, this switch will limit operations to popularimeter frames only
-r RATING,--rating=RATING specify the rating for use in popularimeter tags. RATING may be given either as an integer between 0 & 255 (inclusive) or as one-to-five “stars”, given as [a-zA-Z@#%*+]{1,5} e.g. three stars could be expressed as “xxx” or “###” or “***”.
-t INDEX,--tag=INDEX specify a zero-based index describing which tag to alter, in case of multiple ID3v2 tags in a single file. This option may be given more than once to indicate multiple tags. If not given, all tags will be modified.
-u,--adjust-unsync adjust each tag’s use of the unsynchronisation scheme on write (by default, it’s never used)
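
The two forms accepted by --rating can be told apart mechanically. The Python sketch below classifies an argument as either a numeric rating or a star count; it is illustrative only. The name parse_rating is made up, requiring all star characters to be identical is an assumption drawn from the phrase “repeated one-to-five times”, and the mapping from stars to a 0-255 value is deliberately left out because it is not specified here.

```python
import re

STARS = re.compile(r"[a-zA-Z@#%*+]{1,5}")


def parse_rating(arg):
    """Classify a --rating argument as ('numeric', n) or ('stars', n)."""
    if arg.isascii() and arg.isdigit():
        value = int(arg)
        if value > 255:
            raise ValueError("numeric ratings must lie between 0 and 255")
        return ("numeric", value)
    # Assumption: a star rating is one character repeated one to five times.
    if STARS.fullmatch(arg) and len(set(arg)) == 1:
        return ("stars", len(arg))
    raise ValueError(f"unrecognised rating: {arg!r}")


three_stars = parse_rating("***")
```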

2.9 Invoking `scribbu text`

scribbu text will create, update & delete various ID3v2 text frames & ID3v1 tag fields.

scribbu text Options

2.9.1 `scribbu text` Options

-c|--create-v2 create an ID3v2 tag with a POPM and/or PCNT frames for any file that has an ID3v1 tag but does not have an ID3v2 tag; may not be combined with --always-create-v2
-a|--always-create-v2 always create an ID3v2 tag with a POPM and/or PCNT frame for any file that does not possess one (whether or not there’s an ID3v1 tag present); may not be combined with --create-v2.
-a ALBUM,--album=ALBUM set the TALB, or Album/Movie/Show Title frame
-A ARTIST,--artist=ARTIST Set the TPE1, or Lead artist(s)/Lead performer(s)/Soloist(s)/Performing group frame
-e ENC,--encoded-by=ENC Set the TENC, or Encoded By frame
-g GENRE,--genre=GENRE Set the TCON, or Content type frame
-T TITLE,--title=TITLE Set the TIT2, or Title/Songname/Content description frame
-k TRACK,--track=TRACK Set the TRCK, or Track number/Position in set frame
-y YEAR,--year=YEAR Set the TYER, or Year frame
-d FRAME,--delete=FRAME Specify a frame to remove, if present; this option may be given more than once to delete multiple frames. Frames may be named by either their option name (e.g. ‘artist’) or by their ID3v2.3 frame ID (e.g. TPE1).
-E ENC,--encoding=ENC Specify the character encoding used in the input strings using the iconv name (e.g. ‘ISO-8859-1’). If not given, the system locale will be assumed (see Character Encodings for a complete list of the encodings supported and their identifiers).
-t INDEX,--tag=INDEX Zero-based index of the tag on which to operate; may be given more than once to select multiple tags
-u,--adjust-unsync Update the unsynchronisation flag as needed on write (default is to never use it).
-b,--create-backups Create backup copies of all files before modifying them.

2.10 Invoking `scribbu xtag`

The author has defined a new ID3v2 frame representing a tag cloud. Tags may have zero or more values associated with them, e.g. “hopeful” (zero values), “decade=90s” (one value) or “sub-genres=smooth-jazz,bossa-nova” (two values) & so on.

The frame identifier is XTAG, and a given ID3v2 tag may have multiple XTAG frames, each distinguished by a different owner: a null-terminated string with a URL containing an email address, or a link to a location where an email address can be found, that belongs to the organisation responsible for the frame.

scribbu xtag will create, update & delete this experimental tag cloud frame.

scribbu xtag Options

2.10.1 `scribbu xtag` Options

-u,--adjust-unsync Update the unsynchronisation flag as needed on write (default is to never use it).
-b,--create-backups Create backup copies of all files before modifying them.
-c,--create-v2 Create an ID3v2 tag with an XTAG frame for any file that has an ID3v1 tag but does not have an ID3v2 tag (fields from the ID3v1 tag will be copied over).
-C,--always-create-v2 Always create an ID3v2 tag with the XTAG frame set appropriately for any file that does not possess one.
-f,--create-frame Create a new XTAG frame if not present.
-m,--merge Merge the given tags, don’t overwrite
-n,--dry-run Don’t do anything; just print what would be done.
-g,--get Print the existing tag cloud, if any; don’t set or update
-o OWNER,--owner=OWNER Operate only on XTAG frames with this owner, or specify the owner in case an XTAG frame is being created.
-t INDEX,--tag=INDEX Zero-based index of the tag on which to operate; may be given more than once to select multiple tags.
-T TAG-CLOUD,--xtags=TAG-CLOUD Tags to be set or merged expressed in HTTP query parameter style using URL-encoding, e.g. “foo&bar=has%20%2c&splat=a,b,c”
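
The query-parameter style used by --xtags can be decoded as in the following Python sketch. It is an illustration of the textual format rather than scribbu's parser: the name parse_tag_cloud is hypothetical, and it assumes that a literal comma separates values while a URL-encoded comma (%2c) is part of a value.

```python
from urllib.parse import unquote


def parse_tag_cloud(text):
    """Map each tag name to its (possibly empty) list of values."""
    cloud = {}
    for item in filter(None, text.split("&")):
        name, _, values = item.partition("=")
        # Split on literal commas first, then decode each value.
        cloud[unquote(name)] = (
            [unquote(value) for value in values.split(",")] if values else []
        )
    return cloud


cloud = parse_tag_cloud("foo&bar=has%20%2c&splat=a,b,c")
```

Here foo has zero values, bar has the single value "has ," and splat has three.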

3 Scripting `scribbu`

The set of sub-commands scribbu offers, or could offer, is small in comparison to the number of operations one could possibly hope to carry out in managing ID3 tags. Sooner or later (likely sooner) you will want to do something you can’t accomplish via a sub-command.

For that reason, the bulk of the work on scribbu has been exposing the library’s functionality to a first-class language like LISP (see The Guile Reference Manual), to enable scribbu users to build their own solutions.

Worked Example
ID3v1 tags
ID3v2 tags
Text Encoding

3.1 Worked Example

This chapter begins by demonstrating how to use the interactive Scheme REPL to explore solutions, then demonstrates building Scheme programs using scribbu, and finishes with some references.

The Scheme REPL
Writing Scheme Programs with scribbu
Getting More Information

3.1.1 The Scheme REPL

At the end of “scribbu rename” (see scribbu rename) there were a number of tag hygiene issues to be cleaned up. Let us begin experimenting with solutions. Invoking scribbu with no arguments at all will start the Scheme REPL:

$>: scribbu
scribbu 0.6.23
Copyright (C) 2017-2022 Michael Herstine <sp1ff@pobox.com>

You are in the Guile REPL; in your shell, type `info scribbu' for documentation.

GNU Guile 3.0.9
Copyright (C) 1995-2023 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)>

You are now at the Scheme prompt (“scheme” refers to the language currently in use and “guile-user” refers to the current module). You can type Scheme expressions & have them evaluated:

scheme@(guile-user)> (format #t "Hello, world!")
Hello, world!$1 = #t
scheme@(guile-user)> (define x 1)
scheme@(guile-user)> (set! x (+ x 1))
scheme@(guile-user)> x
$2 = 2
scheme@(guile-user)> (if (> x 1) (format #t "Yes!\n"))
Yes!
$3 = #t
scheme@(guile-user)>

scribbu exports assorted types & functions for working with ID3 tags to the Guile interpreter. Let’s take a look at that ownerless comment frame. We begin by reading in the ID3v2 tagset:

scheme@(guile-user)> (use-modules (oop goops) (scribbu))
scheme@(guile-user)> (define tags (read-tagset "opium.mp3"))
scheme@(guile-user)> tags
$4 = ((#<<id3v2-tag> 1ccf780> 3 #f))

read-tagset returns a list of three-tuples, one for each ID3v2 tag present in its argument. Since “opium.mp3” has only one ID3v2 tag, the list has only one element.

Function: read-tagset file: Read all ID3v2 tags from the beginning of file. Return a list of three-tuples, one for each tag. Each three-tuple consists of an <id3v2-tag> instance, the ID3v2 version (“3” in this case) and a boolean indicating whether the unsynchronisation flag is set.

Let’s examine the tag:

scheme@(guile-user)> (define tag (caar tags))
scheme@(guile-user)> tag
$5 = #<<id3v2-tag> 1ccf780>
scheme@(guile-user)> (let ((frames (slot-ref tag 'frames)) (i 0)) (while (> (length frames) 0) (format #t "~d: ~a\n" i (slot-ref (car frames) 'id)) (set! i (+ i 1)) (set! frames (cdr frames))))
0: encoded-by-frame
1: track-frame
2: comment-frame
3: publisher-frame
4: part-of-a-set-frame
5: year-frame
6: genre-frame
7: album-frame
8: band-frame
9: artist-frame
10: unknown-frame
11: title-frame
12: play-count-frame
13: pop-frame
$6 = #f

We see that tag is an instance of the GOOPS class <id3v2-tag>, and that it has 14 frames. Frame two (counting from zero) is that comment frame:

scheme@(guile-user)> (slot-ref (list-ref (slot-ref tag 'frames) 2) 'dsc)
$7 = ""

As expected, the description field is an empty string– let’s fix that:

scheme@(guile-user)> (slot-set! (list-ref (slot-ref tag 'frames) 2) 'dsc "sp1ff@pobox.com")
$8 = "sp1ff@pobox.com"
# check
scheme@(guile-user)> (slot-ref (list-ref (slot-ref tag 'frames) 2) 'dsc)
$9 = "sp1ff@pobox.com"

Now what about that ID3v1 genre?

scheme@(guile-user)> (define v1 (read-id3v1-tag "opium.mp3"))
scheme@(guile-user)> v1
$10 = #<<id3v1-tag> 18fb8c0>
scheme@(guile-user)> (slot-ref v1 'genre)
$11 = 255

Function: read-id3v1-tag file: Reads the ID3v1 tag, if any, from file.

Let’s set that to “Lounge”– the Winamp genre list sets that to 171:

scheme@(guile-user)> (slot-set! v1 'genre 171)
$12 = 171

What remains is writing out our modifications to their respective tags. We could do this directly in the REPL, but let’s capture our work in the form of a program.

3.1.2 Writing Scheme Programs with `scribbu`

scribbu understands both its own command-line parameters as well as those understood by the guile command. When it sees parameters applicable to guile, it will collect them and pass them on to the Scheme interpreter (when this makes sense, of course; supplying guile options while invoking a scribbu sub-command, for instance, would make no sense & results in an error). This means that scribbu can take advantage of the guile scripting options (see Guile Scripting in The Guile Reference Manual).

Continuing our example, let us capture our work so far:

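#!/usr/local/bin/scribbu -e main -s
!#
(use-modules (oop goops) (scribbu))

;; `-e main' causes main to be called with the command-line arguments
(define (main args)
    (let* ((tags (read-tagset "opium.mp3"))
           (v1   (read-id3v1-tag "opium.mp3"))
           (tag  (caar tags)))
        ;; give the ownerless comment frame (frame two) a description
        (slot-set! (list-ref (slot-ref tag 'frames) 2) 'dsc "sp1ff@pobox.com")
        ;; set the ID3v1 genre
        (slot-set! v1 'genre 171)))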

As written, this Scheme program accomplishes nothing lasting: it corrects the orphaned comment frame as well as the ID3v1 genre, but only in memory. Let us write these out to disk. Writing out the ID3v1 tag is simpler since it’s a fixed size, so we’ll start with that:

#!/usr/local/bin/scribbu -e main -s
!#
(use-modules (oop goops) (scribbu))

;; `-e main' causes main to be called with the command-line arguments
(define (main args)
    (let* ((tags (read-tagset "opium.mp3"))
           (v1   (read-id3v1-tag "opium.mp3"))
           (tag  (caar tags)))
        (slot-set! (list-ref (slot-ref tag 'frames) 2) 'dsc "sp1ff@pobox.com")
        (slot-set! v1 'genre 171)
        ;; write the corrected ID3v1 tag back to the file
        (write-id3v1-tag v1 "opium.mp3")))

Writing an ID3v1 tag is also easier because it is appended to the file.

Function: write-id3v1-tag tag file: Write the ID3v1 tag tag to file, overwriting any (ID3v1) tag that was there previously. Nb. tag may be written as an ID3v1, ID3v1.1 and/or an ID3v1 enhanced tag, depending on the precise contents of tag.

Writing ID3v2 tagsets is more complicated, since their size can vary. write-tagset can either make a wholesale copy of the file, or attempt to emplace the new tagset at the beginning of the extant file (which is the default):

#!/usr/local/bin/scribbu -e main -s
!#
(use-modules (oop goops) (scribbu))

;; `-e main' causes main to be called with the command-line arguments
(define (main args)
    (let* ((tags (read-tagset "opium.mp3"))
           (v1   (read-id3v1-tag "opium.mp3"))
           (tag  (caar tags)))
        (slot-set! (list-ref (slot-ref tag 'frames) 2) 'dsc "sp1ff@pobox.com")
        (slot-set! v1 'genre 171)
        (write-id3v1-tag v1 "opium.mp3")
        ;; write the tagset as ID3v2.3, emplacing it if possible
        (write-tagset (list (list tag 3)) "opium.mp3")))

Function: write-tagset tagset file #:copy copy #:apply-unsync apply-unsync

Write a list of <id3v2-tag> instances to file. tagset is a list of two-element lists; the first in each is the <id3v2-tag> instance to be written, the second is an integer designating the ID3v2 version to use (i.e. 2, 3 or 4).

If keyword argument copy is #t, first write the new tagset to a new file, then append file’s contents after the existing tagset (if any) to that new file, and then rename the new file over the original (making a backup copy first). Otherwise (if copy is #f), attempt to emplace the new tagset, perhaps by adjusting padding, if possible (and fall back to copying).

Keyword argument apply-unsync controls whether the unsynchronisation scheme is applied to each tag. Set this to #f (the default) to never do so, #t to always do so, and as-needed to do so only if the tag needs unsynchronisation.

3.1.3 Getting More Information

References:

See Introduction in The Guile Reference Manual.
Scheme versus Common Lisp https://www.cs.utexas.edu/~novak/schemevscl.html
Schemers.org https://schemers.org

3.2 ID3v1 tags

The original ID3v1 tag contained title, artist, album, year, comment & genre. The fields are fixed-size (30, 30, 30, 4, 30 & 1 byte, respectively). The original proposal called for filling out the fields with nil (zero) values, but that is not universally implemented (Winamp, for instance, pads fields out with ASCII spaces (i.e. 32 = 0x20)).

Michael Mutschler observed that if the fields were zero-padded, an implementation will likely stop reading at the first nil. Therefore, if the second-to-last byte of a field is nil, a one-byte value could be stored in the field’s last byte. He proposed storing the track number in the last byte of the comment field. This became known as ID3v1.1.
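
The fixed layout, the two padding conventions and the ID3v1.1 track-number trick can be sketched in Python as follows. This is an illustration of the on-disk format, not scribbu's reader. Two things here come from the widely circulated description of ID3v1 rather than from the text above: the three-byte "TAG" marker that precedes the fields (bringing the block to 128 bytes), and the choice to decode text as ISO-8859-1. The names ID3V1 and parse_id3v1 are made up.

```python
import struct

# "TAG" marker, then title, artist, album, year, comment and genre: 128 bytes.
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def _text(field):
    """Cut at the first nil, drop space padding, decode as ISO-8859-1."""
    return field.split(b"\x00", 1)[0].rstrip(b" ").decode("iso-8859-1")


def parse_id3v1(block):
    """Parse a 128-byte ID3v1 or ID3v1.1 block into a dictionary."""
    if len(block) != ID3V1.size or not block.startswith(b"TAG"):
        raise ValueError("not an ID3v1 tag")
    _, title, artist, album, year, comment, genre = ID3V1.unpack(block)
    track = None
    if comment[28] == 0 and comment[29] != 0:
        track = comment[29]  # ID3v1.1: track number in the last byte
        comment = comment[:28]
    return {"title": _text(title), "artist": _text(artist),
            "album": _text(album), "year": _text(year),
            "comment": _text(comment), "genre": genre, "track": track}


# A hand-built ID3v1.1 block: empty comment, track number 3.
sample = ID3V1.pack(b"TAG", b"The Body of an American", b"Pogues, The",
                    b"Poguetry in Motion", b"1986", bytes(29) + b"\x03", 88)
tag = parse_id3v1(sample)
```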

A thirty-byte limit quickly became constraining, leading to the ID3v1 “enhanced” specification. The origins of the proposal are unclear to me, but the proposal itself involves prepending a second two-hundred twenty-seven byte block to the ID3v1 block. This extends the title, artist & album fields by sixty bytes each, adds a thirty-byte free-form genre field, and introduces start-time, end-time, and “speed” fields.
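
As a sanity check on the arithmetic, the sketch below lays out the enhanced block in Python. Only the sixty-byte extensions, the thirty-byte genre and the 227-byte total come from the text above; the four-byte "TAG+" marker, the field order, and the sizes of the speed (one byte), start-time and end-time (six bytes each) fields are assumptions based on the commonly circulated description of the format. The names ID3V1_ENHANCED and split_field are hypothetical.

```python
import struct

# Assumed layout: "TAG+", title, artist, album, speed, genre, start, end.
ID3V1_ENHANCED = struct.Struct("<4s60s60s60sB30s6s6s")


def split_field(text, base=30, extra=60):
    """Split text into its ID3v1 part and its enhanced continuation."""
    data = text.encode("iso-8859-1")
    if len(data) > base + extra:
        raise ValueError("field exceeds ninety bytes")
    return data[:base], data[base:]


basic, continuation = split_field("x" * 45)
```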

scribbu represents the ID3v1 tag by the GOOPS class <id3v1-tag>:

(define-class <id3v1-tag> ()
  (title      #:init-value ""  #:accessor title       #:init-keyword #:title)
  (artist     #:init-value ""  #:accessor artist      #:init-keyword #:artist)
  (album      #:init-value ""  #:accessor album       #:init-keyword #:album)
  (year       #:init-value '() #:accessor year        #:init-keyword #:year)
  (comment    #:init-value ""  #:accessor comment     #:init-keyword #:comment)
  (genre      #:init-value 255 #:accessor genre       #:init-keyword #:genre)
  (track-no   #:init-value '() #:accessor track-no    #:init-keyword #:track-no)
  (enh-genre  #:init-value '() #:accessor enh-genre  #:init-keyword #:enh-genre)
  (speed      #:init-value '() #:accessor speed       #:init-keyword #:speed)
  (start-time #:init-value '() #:accessor start-time #:init-keyword #:start-time)
  (end-time   #:init-value '() #:accessor end-time   #:init-keyword #:end-time))

The class’ fields include the union of all ID3v1, ID3v1.1 and ID3v1 enhanced fields. All fields above & beyond those present in ID3v1, however, have a default value of '() (or nil, in Scheme). Whether a given <id3v1-tag> instance is ID3v1, ID3v1.1, and/or ID3v1 enhanced is implicitly determined by whether any of these fields are non-nil.

One can create an <id3v1-tag> instance directly, like any GOOPS class:

(use-modules (oop goops) (scribbu))
(define tag (make <id3v1-tag> #:title  "The Body of an American"
                              #:artist "Pogues, The"
                              #:album  "Poguetry in Motion"
                              #:year   "1986"
                              #:genre  88))

One can also create an instance from an existing tag on disk:

(use-modules (oop goops) (scribbu))
(define tag (read-id3v1-tag "foo.mp3"))
;; slot-ref takes the slot name as a symbol
(format #t "~s - ~s\n" (slot-ref tag 'artist) (slot-ref tag 'title))

<id3v1-tag> instances can be written to disk via write-id3v1-tag: (write-id3v1-tag tag "bar.mp3"). If any of the title, artist or album slots are longer than thirty characters, or any of the new fields (enhanced genre, speed, start-time or end-time) are non-nil, it will be written as an ID3v1 enhanced tag.

3.3 ID3v2 tags

The various flavors of ID3v1 tags had obvious limitations, leading to the introduction in 1998 of ID3v2 (by Martin Nilsson, Michael Mutschler et al.). Despite the name, this format has nothing to do with ID3v1. ID3v2 tags are much more complex. The tags are pre-pended to the files they describe. They are comprised of one or more frames, each of which contains one piece of information. There is provision for padding appended to the tag, to permit subsequent augmentation of the tag without having to re-write the entire file. ID3v1 tags suffered from the fact that they encoded text as ASCII (ISO-8859-1, at most): ID3v2 carried the encoding scheme along with textual information.

Furthermore, there are three versions of the ID3v2 spec that saw general use:

ID3v2.2 was the first public version; it used three-character frame identifiers instead of four; it is generally considered obsolete.
ID3v2.3 introduced four-character frame identifiers as well as adding a number of new frames along with a second, extended header. This is the version most frequently encountered in the wild.
ID3v2.4 was the last version published, but never saw widespread adoption.

The Unsynchronisation Scheme
ID3v2 Frames
<id3v2-tag>

3.3.1 The Unsynchronisation Scheme

MPEG decoding software uses a two-byte sentinel value in the input stream to detect the beginning of the audio. MPEG decoding software that is not ID3-aware could mistakenly interpret that value as the beginning of the audio should it happen to occur in an ID3v2 tag. Unsynchronisation is an optional encoding scheme for the ID3v2 tag to prevent that. "Unsynchronisation may only be made with MPEG 2 layer I, II and III and MPEG 2.5 files" (http://id3.org/id3v2-00).

More specifically, whenever a two byte combination of the form:

11111111 111xxxxx

(i.e. 0xFF 0xEx or 0xFF 0xFx) is encountered in an ID3v2 tag to be written to disk, it is replaced with:

11111111 00000000 111xxxxx

and the unsynchronisation flag will be set.

This leaves us with an ambiguous situation on read: if we encounter a bit pattern

11111111 00000000 111xxxxx

when reading a tag with the unsynchronisation flag set, we have no way to know whether that was a false sync that was unsynchronised (and so the three bytes should be interpreted as 11111111 111xxxxx) or whether those three bytes had occurred naturally in the tag when it was written. To resolve this, on applying unsynchronisation all two-byte sequences of the form $FF 00 should also be written as $FF 00 00.

ID3v2.4 introduced unsynchronisation at a frame level; the unsynchronisation flag in the header being set indicates that all frames are unsynchronised; unset in the header means that at least one frame is *not* unsynchronised.

Note that since the point of unsynchronisation is to avoid presenting a false sync point to the MPEG decoding software, unsynchronisation should be employed last, after any compression or encryption.
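
The scheme described in this section is compact enough to sketch in Python. This illustrates the byte-level transformation only and makes no claim about how scribbu implements it; the function names are made up. A tag whose final byte is 0xFF is left as-is here, which a real writer would need to consider separately.

```python
def unsynchronise(data):
    """Insert a zero byte after 0xFF when followed by 111xxxxx or 0x00."""
    out = bytearray()
    for i, byte in enumerate(data):
        out.append(byte)
        if byte == 0xFF and i + 1 < len(data):
            following = data[i + 1]
            if following & 0xE0 == 0xE0 or following == 0x00:
                out.append(0x00)
    return bytes(out)


def resynchronise(data):
    """Reverse unsynchronise: drop the zero byte that follows each 0xFF."""
    out = bytearray()
    i = 0
    while i < len(data):
        out.append(data[i])
        if data[i] == 0xFF and i + 1 < len(data) and data[i + 1] == 0x00:
            i += 1  # skip the inserted zero byte
        i += 1
    return bytes(out)


false_sync = bytes([0xFF, 0xE3, 0x41])
encoded = unsynchronise(false_sync)
```

Because every $FF 00 in the original is also expanded to $FF 00 00, decoding is unambiguous: each 0x00 that directly follows 0xFF in the encoded data was inserted and can be dropped.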

3.3.2 ID3v2 Frames

All ID3v2 frames subclass GOOPS class <id3v2-frame>:

(define-class <id3v2-frame> ()
  (id     #:init-value 'unknown-frame #:accessor id     #:init-keyword #:id)
  (tap    #:init-value '()            #:accessor tap    #:init-keyword #:tap)
  (fap    #:init-value '()            #:accessor fap    #:init-keyword #:fap)
  (ro     #:init-value '()            #:accessor ro     #:init-keyword #:ro)
  (unsync #:init-value '()            #:accessor unsync #:init-keyword #:unsync))

id is a symbol naming the frame (see Frame Identifiers). The remaining four fields are frame flags that can be either true (#t), false (#f) or just left undefined ('()):

tap (Tag Alter Preserve): “This flag tells the software what to do with this frame if it is unknown and the tag is altered in any way. This applies to all kinds of alterations, including adding more padding and reordering the frames.” (Sec 3.3.1)
fap (File Alter Preserve): “This flag tells the software what to do with this frame if it is unknown and the file, excluding the tag, is altered. This does not apply when the audio is completely replaced with other audio data.” (Sec 3.3.1)
ro (Read Only): “This flag, if set, tells the software that the contents of this frame is intended to be read only. Changing the contents might break something, e.g. a signature. If the contents are changed, without knowledge in why the frame was flagged read only and without taking the proper means to compensate, e.g. recalculating the signature, the bit should be cleared.” (Sec 3.3.1)
unsync (Unsynchronisation): In ID3v2.2 & ID3v2.3, a value of #t for this flag indicates that the unsynchronisation scheme (see The Unsynchronisation Scheme) has been applied to this tag. In ID3v2.4, it indicates that it has been applied to all frames.

Module scribbu defines a few <id3v2-frame> sub-classes.

<text-frame>
<comment-frame>
<user-defined-text-frame>
<play-count-frame>
<popm-frame>
<tag-cloud-frame>
<unk-frame>

3.3.2.1 `<text-frame>`

A great many ID3v2 frames represent textual information (title, artist &c) and are represented in a uniform way, distinguished only by frame identifier. scribbu represents such frames as instances of <text-frame>:

(define-class <text-frame> (<id3v2-frame>)
  (text #:init-value "" #:accessor text #:init-keyword #:text))

3.3.2.2 `<comment-frame>`

<comment-frame> encodes the COM & COMM (comment) frames. #:lang is a three-letter ISO-639-2 language code. The #:dsc field is described in the specification as a “short content description”.

(define-class <comment-frame> (<id3v2-frame>)
  (lang  #:init-value "eng" #:accessor lang #:init-keyword #:lang)
  (dsc   #:init-value ""    #:accessor dsc  #:init-keyword #:dsc)
  (text  #:init-value ""    #:accessor text #:init-keyword #:text))

3.3.2.3 `<user-defined-text-frame>` ¶

<user-defined-text-frame> encodes the TXX & TXXX (user-defined text) frames. The #:dsc field is a description of the textual information & #:text is the information itself. There may be multiple user-defined text frames in a tag, but only one with a given description. Cf. section 4.2.2 of the ID3v2 spec.

(define-class <user-defined-text-frame> (<id3v2-frame>)
  (dsc   #:init-value "" #:accessor dsc  #:init-keyword #:dsc)
  (text  #:init-value "" #:accessor text #:init-keyword #:text))

3.3.2.4 `<play-count-frame>` ¶

<play-count-frame> encodes the CNT & PCNT (play count) frames. The #:count field is simply a counter recording the number of times the file has been played (see scribbu popm). There may be only one <play-count-frame> in a tag. Cf. section 4.17 of the ID3v2 spec.

(define-class <play-count-frame> (<id3v2-frame>)
  (count #:init-value 0 #:accessor count #:init-keyword #:count))

3.3.2.5 `<popm-frame>` ¶

<popm-frame> encodes the POP & POPM (popularimeter) frames. <popm-frame> combines an eight-bit rating field with a <play-count-frame>-style play count. Unlike <play-count-frame>, there may be multiple <popm-frame> frames because each is tagged with the e-mail address of the author.

(define-class <popm-frame> (<id3v2-frame>)
  (e-mail #:init-value "" #:accessor e-mail #:init-keyword #:e-mail)
  (rating #:init-value 0  #:accessor rating #:init-keyword #:rating)
  (count  #:init-value 0  #:accessor count  #:init-keyword #:count))

3.3.2.6 `<tag-cloud-frame>` ¶

<tag-cloud-frame> represents the author’s “tag cloud” (XTAG) frame. Like <popm-frame>, there may be multiple <tag-cloud-frame> frames because each is tagged with the e-mail address of the author. The tag cloud itself (field tags) is represented by a URL-encoded string.

(define-class <tag-cloud-frame> (<id3v2-frame>)
  (owner #:init-value ""  #:accessor owner #:init-keyword #:owner)
  (tags  #:init-value '() #:accessor tags  #:init-keyword #:tags))

3.3.2.7 `<unk-frame>` ¶

Frames about which scribbu does not know may be encoded as <unk-frame> instances:

(define-class <unk-frame> (<id3v2-frame>)
  (id-text #:init-value ""     #:accessor frameid #:init-keyword #:frameid)
  (data    #:init-value #vu8() #:accessor data    #:init-keyword #:data))

The data field will contain everything beyond the ID3v2 header; i.e. the frame identifier & flags will have been parsed out.

3.3.3 `<id3v2-tag>` ¶

The scribbu ID3v2 tag abstraction doesn’t try to model the various versions of the ID3v2 spec. Rather, it encodes a “generic” ID3v2 tag; the version to which it shall be serialized is specified at write time, and the version from which it was deserialized is returned at read time (See ID3v2 Serialization.)

(define-class <id3v2-tag> ()
  (experimental #:init-value '() #:accessor experimental
                #:init-keyword #:experimental)
  (frames       #:init-value '() #:accessor frames  #:init-keyword #:frames)
  (padding      #:init-value   0 #:accessor padding #:init-keyword #:padding))

ID3v2 Serialization
Miscellaneous Functions

3.3.3.1 ID3v2 Serialization ¶

While you can of course create an <id3v2-tag> instance “from scratch” (in-memory, as a result of a call to (make <id3v2-tag> ...)), you will more frequently be reading them from files on disk.

The function for doing this is read-tagset. The name is intended as a reminder that a file can have multiple ID3v2 tags, so you are in general reading a tag set, not just a tag.

scheme@(guile-user)> (define tags (read-tagset "opium.mp3"))
scheme@(guile-user)> tags
$1 = ((#<<id3v2-tag> 56188c2a3d80> 3 #f))

read-tagset returns a list of three-tuples, one tuple for each tag (so it could return '(), if the file contained no ID3v2 tags). Each three-tuple contains:

an <id3v2-tag> instance, representing the tag
the ID3v2 version as which the tag was serialized (i.e. 2, 3 or 4)
a boolean indicating whether the unsynchronisation bit was set (see The Unsynchronisation Scheme).

Once you’ve created or updated your ID3v2 tag(s), you will presumably want to write it (them) to disk, presumably in place of an existing tagset. This is done via write-tagset(tags, file, ...). tags is a list of two-tuples: the first element is always an <id3v2-tag> instance to be written to disk & the second is the ID3v2 version under which it shall be serialized (i.e. an int, either 2, 3 or 4). file is the file into which the new tagset shall be written, replacing any tagset present therein.

write-tagset takes a few optional parameters:

#:apply-unsync governs whether the unsynchronisation scheme (see The Unsynchronisation Scheme) should be applied when writing out the given tags: #f (the default) means never, #t means it will always be applied and 'as-needed means that it will be applied to any tag whose serialization would contain false syncs.
#:copy governs whether a backup copy of the target file will be made: a value of #f (the default) means that the new tagset will be written in place (moving the audio data & ID3v1 tag, if any, if needed) and a value of #t means that the target file will be copied to a backup, the new tagset will be written, and then the track data & ID3v1 tag (if any) will be copied over to the new file.
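
To make 'as-needed more concrete, here is a minimal C++ sketch of the two operations involved, following the ID3v2 specification rather than the libscribbu sources: a false sync is a 0xFF byte followed by a byte whose top three bits are set, and unsynchronisation inserts a zero byte after such a 0xFF (and after any 0xFF that precedes 0x00, so the transformation can be reversed). The names has_false_sync & unsynchronise are hypothetical, and the sketch does not treat a trailing 0xFF specially.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: true if `buf` contains a false sync, i.e. a 0xFF byte
// immediately followed by a byte whose three most significant bits are set.
bool has_false_sync(const std::vector<std::uint8_t> &buf)
{
  for (std::size_t i = 0; i + 1 < buf.size(); ++i) {
    if (0xFF == buf[i] && 0xE0 == (buf[i + 1] & 0xE0)) {
      return true;
    }
  }
  return false;
}

// Hypothetical helper: apply the unsynchronisation scheme by inserting a
// zero byte after each 0xFF that is followed either by a sync-like byte
// (top three bits set) or by 0x00.
std::vector<std::uint8_t> unsynchronise(const std::vector<std::uint8_t> &buf)
{
  std::vector<std::uint8_t> out;
  out.reserve(buf.size());
  for (std::size_t i = 0; i < buf.size(); ++i) {
    out.push_back(buf[i]);
    if (0xFF == buf[i] && i + 1 < buf.size() &&
        (0xE0 == (buf[i + 1] & 0xE0) || 0x00 == buf[i + 1])) {
      out.push_back(0x00);
    }
  }
  return out;
}
```

Under these assumptions, 'as-needed corresponds to calling unsynchronise only when has_false_sync returns true for the serialized tag.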

3.3.3.2 Miscellaneous Functions ¶

Module (scribbu) provides a few other functions that can be useful for working with ID3v2 tags & frames.

Function: with-track-in directory fn ¶

with-track-in(directory, fn) is a convenience function; it will iterate over all filesystem entities in directory and apply fn to them. fn shall be a function taking three parameters:

a tagset, such as what is returned from read-tagset
a string naming the filesystem entity
an ID3v1 tag (nil if none exists)

Example:

scheme@(guile-user)> (with-track-in "." (lambda (tags pth v1) (format #t "~s has ~d ID3v2 tags\n" pth (length tags))))
"./track.dat" has 0 ID3v2 tags
"./id3v22-tda.mp3" has 1 ID3v2 tags
...

Function: has-frame? tag id ¶: Return #t if tag has a frame with identifier id.

Function: get-frames tag id ¶: Returns a (possibly empty) list of frames in tag with identifier id.

3.4 Text Encoding ¶

The various string fields bring up the question: what text encoding is used? There are actually three text encodings in play:

the encoding in use in your Scheme source files
the encoding in use within the Guile interpreter
the encoding in use in libscribbu

The first is documented in the Guile manual under Character Encoding of Source Files in The Guile Reference Manual. The upshot is this: UTF-8 is assumed, but the author may tell Guile what is being used through a coding hint:

;;; coding: iso-8859-1

The set of encodings recognized is defined by IANA in RFC2978.

The second is also documented in the Guile manual, under String Internals in The Guile Reference Manual:

Guile stores each string in memory as a contiguous array of Unicode code points along with an associated set of attributes. If all of the code points of a string have an integer range between 0 and 255 inclusive, the code point array is stored as one byte per code point: it is stored as an ISO-8859-1 (aka Latin-1) string. If any of the code points of the string has an integer value greater that 255, the code point array is stored as four bytes per code point: it is stored as a UTF-32 string.

Conversion between the one-byte-per-code-point and four-bytes-per-code-point representations happens automatically as necessary.
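
The storage rule quoted above is easy to state in code. The following C++ sketch only illustrates the rule; it is not how Guile is implemented, and bytes_per_code_point is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Bytes used per code point under the rule quoted above: one byte when every
// code point lies in [0, 255] (Latin-1), four bytes otherwise (UTF-32).
// `bytes_per_code_point` is a hypothetical name, not a Guile API.
std::size_t bytes_per_code_point(const std::vector<char32_t> &code_points)
{
  const bool narrow = std::all_of(code_points.begin(), code_points.end(),
                                  [](char32_t cp) { return cp <= 0xFF; });
  return narrow ? 1 : 4;
}
```

So a title such as “café” can be stored one byte per code point, while a single Greek or CJK character anywhere in the string pushes the whole string to the four-byte representation.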

That just leaves libscribbu. On read (that is, when the library reads text from tags on disk), the encoding is sometimes specified by the tag itself, or is specified by the caller, or is guessed. From there, it will be converted to a Guile string. On write, text will be converted from the internal Guile representation to the desired text encoding on disk (deduced from either caller preferences or the frame settings themselves).
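
To give a feel for what such a conversion involves, here is a hand-rolled C++ sketch of one special case, ISO-8859-1 to UTF-8. libscribbu delegates this work to iconv; the function below is a stand-in for illustration only, and latin1_to_utf8 is a hypothetical name.

```cpp
#include <string>

// Convert ISO-8859-1 bytes to UTF-8. Each Latin-1 byte is the Unicode code
// point of the same value, so bytes below 0x80 are copied unchanged and the
// rest become two-byte UTF-8 sequences.
std::string latin1_to_utf8(const std::string &in)
{
  std::string out;
  out.reserve(in.size());
  for (unsigned char c : in) {
    if (c < 0x80) {
      out.push_back(static_cast<char>(c));
    } else {
      out.push_back(static_cast<char>(0xC0 | (c >> 6)));
      out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
    }
  }
  return out;
}
```

Most other encoding pairs need lookup tables or stateful decoding, which is the reason to lean on a library such as iconv rather than write conversions by hand.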

4 Using `libscribbu` ¶

The third way in which to use scribbu is to link against the library libscribbu. Detailed documentation can be found in the libscribbu source itself (Doxygen documentation can be produced by doing cd doc && make doxygen-doc).

While detailed documentation on individual classes, free functions, and sub-systems may one day make its way into this manual, for now this chapter will describe using the library through a worked example. This example can be found in the examples/az-tags sub-directory of the scribbu source distribution.

Building a libscribbu program

4.1 Building a `libscribbu` program ¶

Let us write a small C++ program using libscribbu to clean up Amazon.com Song IDs. Files downloaded as .mp3s from Amazon.com carry ID3v2 tags containing non-compliant comment frames (in that they have no description). Amazon also tries to cram the Song ID into the comment field of the ID3v1 tag, even though that field is generally too small to contain the entire string. We will call this program az-tags.

Implementing az-tags
building az-tags

4.1.1 Implementing `az-tags` ¶

The complete source for the program can be found in examples/az-tags/main.cc in the source distribution. The logic is simple enough to fit completely in main. The usage is:

az-tags [-h] [-v] file [file...]

Skipping command line parsing, we begin by initializing the library:

// ...
#include <scribbu/scribbu.hh>
// ...
int
main(int argc, char * argv[])
{
    // Parse command-line options...
    scribbu::static_initialize();

libscribbu needs to carry out assorted initialization; rather than deal with the static initialization problem, it just depends on the caller to explicitly initialize the library.

At this point, the first filename is waiting in argv[optind], so we can set the basic structure of the program:

    for (int i = optind; i < argc; ++i) {
        // ...
    }

For each file, we will open it & parse it into its ID3v2 tags, track data and ID3v1 tag:

  for (int i = optind; i < argc; ++i) {

    fs::ifstream ifs(argv[i], ios_base::binary); // 1

    vector<unique_ptr<scribbu::id3v2_tag>> id3v2;
    scribbu::read_all_id3v2(ifs, back_inserter(id3v2)); // 2
    scribbu::track_data td((istream&)ifs); // 3
    unique_ptr<scribbu::id3v1_tag> pid3v1 = scribbu::process_id3v1(ifs); // 4

At 1, we open the file, taking care to use binary mode so as to avoid newline translation. At 2 we ask libscribbu to read any and all ID3v2 tags into id3v2. We’ve used a vector here, but we can use any container providing a forward output iterator.

At this point, the file pointer is pointing just past the last ID3v2 tag (if any– there may be none, in which case the file pointer remains at the beginning of the file and id3v2 is empty). The easiest way to consume the track data is to construct a track_data instance with it. This will collect some data about the track and advance the file pointer to the one-past-the-end point.
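
How does a reader know where an ID3v2 tag ends? Per the ID3v2 specifications, each tag opens with a ten-byte header: the characters “ID3”, a major version byte, a revision byte, a flags byte and a four-byte “syncsafe” size in which only the low seven bits of each byte are significant. The following sketch parses that header without using libscribbu; the type & function names are hypothetical and it is not how read_all_id3v2 is implemented.

```cpp
#include <array>
#include <cstdint>

// Hypothetical type, for illustration only; not part of libscribbu.
struct id3v2_header_info {
  std::uint8_t  version;   // major version: 2, 3 or 4
  std::uint8_t  revision;
  std::uint8_t  flags;
  std::uint32_t size;      // tag size as stored, excluding this header
};

// Parse the ten-byte ID3v2 header in `b`. Returns false if the bytes do not
// look like an ID3v2 header; on success fills in `out`.
bool parse_id3v2_header(const std::array<std::uint8_t, 10> &b,
                        id3v2_header_info &out)
{
  if ('I' != b[0] || 'D' != b[1] || '3' != b[2]) {
    return false;
  }
  std::uint32_t size = 0;
  for (int i = 6; i < 10; ++i) {
    if (b[i] & 0x80) {
      return false; // high bit set: not a syncsafe byte
    }
    size = (size << 7) | b[i];
  }
  out.version  = b[3];
  out.revision = b[4];
  out.flags    = b[5];
  out.size     = size;
  return true;
}
```

A reader can skip size bytes past the header to reach either the next tag or the start of the track data.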

There may or may not be some kind of ID3v1 tag waiting for us. That is why process_id3v1 returns a unique_ptr– if there is no ID3v1 tag, a null pointer will be returned.

We now have zero or more ID3v2 tags to be processed in id3v2:

    for (auto &ptag: id3v2) {
        // `ptag' is a reference to a unique_ptr<id3v2_tag>
        // how to get at its frames?
    }

It is at this point that the libscribbu API turns out to be less than ergonomic. The issue is that read_all_id3v2 returns the tags typed as pointers to id3v2_tag; this is a base class providing a “generic” interface supported by all ID3v2 tags, but the API for iterating over frames is provided individually by each sub-class (id3v2_2_tag, id3v2_3_tag & id3v2_4_tag).

Perhaps it would be worth it to provide an interface on the base class to do this, but for now, I simply dynamic_cast & dispatch to a template function process_tag:

    for (auto &ptag: id3v2) {
      switch (ptag->version()) {
      case 2: { // ID3v2.2 tag
        scribbu::id3v2_2_tag &p = dynamic_cast<scribbu::id3v2_2_tag&>(*ptag);
        process_tag(p);
        break;
      }
      case 3: { // ID3v2.3 tag
        scribbu::id3v2_3_tag &p = dynamic_cast<scribbu::id3v2_3_tag&>(*ptag);
        process_tag(p);
        break;
      }
      case 4: { // ID3v2.4 tag
        scribbu::id3v2_4_tag &p = dynamic_cast<scribbu::id3v2_4_tag&>(*ptag);
        process_tag(p);
        break;
      }
      default:
        cerr << "Unknown ID3v2 revision " << ptag->version() << endl;
        abort();
      }
    }

The template parameter is the id3v2_tag sub-class. Since there are only three, I can factor out the ID3v2-version-specific logic into a traits class:

template <class tag_type>
void
process_tag(tag_type &T)
{
 ...

  for (auto fp: T) { // 1
    if (traits_type::COMMID == fp->id()) { // 2

      id3v2_frame &F = fp;
      comm_type &C = dynamic_cast<comm_type&>(F); // 3

      string dsc = C.template description<string>();
      if (dsc.empty()) {
        string txt = C.template text<string>();
        if ("Amazon.com Song ID" == txt.substr(0, 18)) {
          cout << "updating the comment frame containing " << txt << endl;
          fp = traits_type::replace(C);
        }
      }
    }
  }
}

Each concrete id3v2_tag subclass implements begin & end, so we can use instances thereof as targets in range-based for loops like 1. fp is actually a mutable proxy for an ID3v2-version-specific id3v2_frame subclass. At 2 we have factored out the precise frame ID to select for comment frames.

Each ID3v2 version has a concrete comment frame type, to which we again dynamically cast (I really need to re-evaluate this interface) at 3.

The rest of the logic is straightforward– if there is no description field in the comments frame, and the comment text begins with “Amazon.com Song ID”, replace the frame.
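
The listing above elides the definitions of traits_type & comm_type. As a rough guide to the general shape of such a traits class, here is a self-contained sketch; every name in it (the stub tag types, tag_traits, comment_id, is_comment_frame) is hypothetical and stands in for the real libscribbu types, which are not reproduced here.

```cpp
#include <string>

// Stand-ins for the three libscribbu tag classes; the real ones are far richer.
struct v2_2_tag {};
struct v2_3_tag {};
struct v2_4_tag {};

// The primary template is declared but never defined, so using an
// unsupported tag type is a compile-time error.
template <class tag_type> struct tag_traits;

template <> struct tag_traits<v2_2_tag> {
  static const char *comment_id() { return "COM"; }
};
template <> struct tag_traits<v2_3_tag> {
  static const char *comment_id() { return "COMM"; }
};
template <> struct tag_traits<v2_4_tag> {
  static const char *comment_id() { return "COMM"; }
};

// Version-independent logic asks the traits class for version-specific facts.
template <class tag_type>
bool is_comment_frame(const tag_type &, const std::string &frame_id)
{
  return frame_id == tag_traits<tag_type>::comment_id();
}
```

The point of the pattern is that the body of a function template like process_tag is written once, while each specialization supplies the frame identifier (and, in the real program, the comment frame type) for its ID3v2 version.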

4.1.2 building `az-tags` ¶

The next step is to compile the program. We shall use Autotools, beginning with the simplest configure.ac we can:

AC_PREREQ([2.69])
AC_INIT([az-tags], [0.1], [sp1ff@pobox.com])
AC_CONFIG_MACRO_DIR([macros])
AC_CONFIG_SRCDIR([src/main.cc])
AC_CONFIG_AUX_DIR([build-aux])
AC_CONFIG_HEADERS([config.h])
AM_INIT_AUTOMAKE([-Wall -Werror])
LT_INIT
AC_PROG_CXX
AC_CONFIG_FILES([Makefile src/Makefile])
AC_OUTPUT

AC_PREREQ just asserts that Autoconf 2.69 is required to build a configure script from this template. AC_INIT is the Autoconf initialization macro. We’re going to need some custom macros for this project, so AC_CONFIG_MACRO_DIR tells Autoconf where to find them. AC_CONFIG_SRCDIR is just a sanity check– when running configure users will sometimes pass an incorrect value for --srcdir– this macro equips the generated configure script to catch that. AC_CONFIG_AUX_DIR tells Autoconf to place auxiliary scripts (missing & install-sh, e.g.) in a sub-directory named build-aux.

AC_CONFIG_HEADERS tells Autoconf to generate a header file named config.h containing C preprocessor #defines for the project. Note that we need to generate a template file config.h.in via autoheader.

Finally, we initialize Automake, libtool, check for a C++ compiler & produce Makefile templates.

The Automake template for the root makefile is trivial:

SUBDIRS = src

Let us begin the Makefile template in src:

bin_PROGRAMS = az-tags
az_tags_SOURCES = main.cc
AM_CXXFLAGS = -std=c++17

We will need to perform some one-time setup:

mkdir build-aux
touch NEWS README AUTHORS ChangeLog
autoheader
aclocal
autoconf
automake --add-missing

At this point, we can run ./configure, but make will fail miserably. Our program needs to be able to find scribbu, openssl and boost includes, along with the corresponding libraries. All the required libraries other than libscribbu provide pre-built macros which we can copy from the scribbu source distro into macros. Let us add the following lines to configure.ac, just before the call to AC_CONFIG_FILES:

PKG_CHECK_MODULES([GUILE], [guile-2.2])
AX_BOOST_BASE([1.58], [],
    [AC_MSG_ERROR([Scribbu requires boost_base 1.58 or later.])])
echo "Checkpoint 3: BOOST_LDFLAGS is $BOOST_LDFLAGS;" >&AS_MESSAGE_LOG_FD

AX_BOOST_IOSTREAMS
AX_BOOST_FILESYSTEM
AX_BOOST_SYSTEM
AX_CHECK_OPENSSL([],[AC_MSG_ERROR([Scribbu requires openssl.])])

Each of these will define Automake variables describing where we can find headers & libraries which we can add to src/Makefile.am, which now reads:

bin_PROGRAMS = az-tags
az_tags_SOURCES = main.cc
AM_CPPFLAGS = $(BOOST_CPPFLAGS)
AM_CXXFLAGS = -std=c++17 $(GUILE_CFLAGS)
AM_LDFLAGS = $(BOOST_LDFLAGS)
LDADD = $(GUILE_LIBS)           \
	$(BOOST_SYSTEM_LIB)     \
	$(BOOST_FILESYSTEM_LIB) \
	$(BOOST_IOSTREAMS_LIB)  \
	$(OPENSSL_LIBS)

This just leaves the question of where to find libscribbu. scribbu, at the time of this writing, provides no Autoconf macros (however, this sample provided the author the opportunity to prototype one).

We add the following code to configure.ac, just after the call to AC_PROG_CXX (it’s a lot of code; step-by-step explanation to follow):

AC_ARG_WITH([scribbu],
    AS_HELP_STRING([--with-scribbu=DIR],
                   [root directory of scribbu installation]),
    [
        case "$withval" in
	"" | y | ye | yes | n | no)
	    AC_MSG_ERROR([--with-scribbu takes a root directory]);;
	*)
	    scribbu_dirs="$withval";;
	esac
    ],
    [
        # Just use the defaults
	scribbu_dirs="/usr/local /usr /opt/local /sw"
    ])

dnl One way or another, we have one or more candidates in ${scribbu_dirs}
found=no
for scribbu_home in ${scribbu_dirs}; do
    AC_MSG_CHECKING([for scribbu/scribbu.hh under ${scribbu_home}])
    if test -f "${scribbu_home}/include/scribbu/scribbu.hh"; then
        SCRIBBU_CPPFLAGS="-I${scribbu_home}/include"
	SCRIBBU_LDFLAGS="-L${scribbu_home}/lib"
	SCRIBBU_LIBS="-lscribbu"
	found=yes
	AC_MSG_RESULT([yes])
	break
    else
        AC_MSG_RESULT([no])
    fi
done

if test "$found" != "yes"; then
    AC_MSG_ERROR([couldn't find scribbu])
fi

# try the preprocessor and linker with our new flags,
# being careful not to pollute the global LIBS, LDFLAGS, and CPPFLAGS
AC_MSG_CHECKING([whether compiling and linking against scribbu will work])

save_LIBS="$LIBS"
save_LDFLAGS="$LDFLAGS"
save_CPPFLAGS="$CPPFLAGS"
LIBS="$SCRIBBU_LIBS $LIBS"
LDFLAGS="$SCRIBBU_LDFLAGS $LDFLAGS"
CPPFLAGS="$SCRIBBU_CPPFLAGS $CPPFLAGS"

AC_LANG_PUSH([C++])
AC_CHECK_HEADER([scribbu/scribbu.hh], [scribbu_hh=yes], [scribbu_hh=no])
# I'd like to do AC_CHECK_LIB here, but I can't link against libscribbu
# in a test because it, in turn depends on a bunch of other libs
AC_CHECK_FILE([${scribbu_home}/lib/libscribbu.la],
    [scribbu_la=yes], [scribbu_la=no])
AC_LANG_POP([C++])

LIBS="$save_LIBS"
LDFLAGS="$save_LDFLAGS"
CPPFLAGS="$save_CPPFLAGS"

if test "yes" = "$scribbu_hh" && test "yes" = "$scribbu_la"; then
    AC_DEFINE([HAVE_SCRIBBU], [1], [Define to 1 if you have libscribbu])
else
    AC_MSG_ERROR([az-tags requires scribbu])
fi

AC_SUBST([SCRIBBU_CPPFLAGS])
AC_SUBST([SCRIBBU_LIBS])
AC_SUBST([SCRIBBU_LDFLAGS])

The first step is to locate libscribbu. We will form the variable scribbu_dirs containing one or more directories to check. Now, the user could always just tell us where it is. That is the reason we begin with AC_ARG_WITH: if the user invokes configure with --with-scribbu=... we will just use that. Otherwise, we will examine a default set of locations.

That’s what the for loop does; for each location in scribbu_dirs, it checks for scribbu.hh in a sub-directory named include/scribbu of the current location. On success, we set a few variables recording that result & break. If we check all locations without success, then we fail.

Now, just because we found a header file at a given place doesn’t mean we can build against it or its associated library. The typical idiom is to execute the macros AC_CHECK_HEADER and AC_CHECK_LIB to make sure we can include the header and link against the library, respectively.

The problem in my case is that AC_CHECK_LIB will fail, not through any fault of libscribbu, but because it depends on a number of other libraries; the test will fail with unresolved externals & I can’t see how to add the relevant link flags in the macro. Instead, I settle for AC_CHECK_FILE.

If both these pass, we know we’re good to go; the question remains: how to record the information we’ve just discovered? The Autoconf manual states that one should never add options to user variables such as CPPFLAGS. The idiom seems to be to define new variables that the Automake author can add to their rules. In this case, we create three new variables:

SCRIBBU_CPPFLAGS to hold the -I option that will enable the build to find the libscribbu headers
SCRIBBU_LIBS to hold the -l option that will enable the build to link against libscribbu
SCRIBBU_LDFLAGS to hold the -L option telling the linker where to find libscribbu

This lets us augment src/Makefile.am to:

bin_PROGRAMS = az-tags
az_tags_SOURCES = main.cc
AM_CPPFLAGS = $(BOOST_CPPFLAGS) $(SCRIBBU_CPPFLAGS)
AM_CXXFLAGS = -std=c++17 $(GUILE_CFLAGS)
AM_LDFLAGS = $(SCRIBBU_LDFLAGS) $(BOOST_LDFLAGS)
LDADD = $(SCRIBBU_LIBS)         \
        $(GUILE_LIBS)           \
	$(BOOST_SYSTEM_LIB)     \
	$(BOOST_FILESYSTEM_LIB) \
	$(BOOST_IOSTREAMS_LIB)  \
	$(OPENSSL_LIBS)

With that, we can configure:

$>: autoreconf -vfi
autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --force
...
$>: ./configure --prefix=$HOME
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
...
config.status: creating Makefile
config.status: creating src/Makefile
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands
$>: make
make  all-recursive
make[1]: Entering directory '/tmp/az-tags'
Making all in src
make[2]: Entering directory '/tmp/az-tags/src'
g++ -DHAVE_CONFIG_H -I. -I/home/mgh/doc/code/projects/az-tags/src -I..  -I/usr/include   -std=c++17 -pthread -I/usr/local/include/guile/2.2 -g -O2 -MT main.o -MD -MP -MF .deps/main.Tpo -c -o main.o /home/mgh/doc/code/projects/az-tags/src/main.cc
...

We have a build! Let us take a look at a file downloaded from Amazon.com:

$>: scribbu dump lorca.mp3
"lorca.mp3":
ID3v2.3(.0) Tag:
452951 bytes, synchronised
...
COMM (<no description>):
Amazon.com Song ID: 203558254
...
9425708 bytes of track data:
MD5: 48ff9cadea7d842e9059db25159d2daa
ID3v1.1: The Pogues - Lorca's Novena
Hell's Ditch [Expanded] (US Ve (track 5), 1990
Amazon.com Song ID: 20355825
unknown genre 255

$>: src/az-tags lorca.mp3
lorca.mp3 has 1 ID3v2 tags, and an ID3v1 tag
updating the comment frame containing Amazon.com Song ID: 203558254
all tags processed; emplacing new tagset...
emplacing new tagset...done.
clearing ID3v1 comment
$>: scribbu dump lorca.mp3
"lorca.mp3":
ID3v2.3(.0) Tag:
452951 bytes, synchronised
...
COMM (amazon.com song id):
Amazon.com Song ID: 203558254
...
9425708 bytes of track data:
MD5: 48ff9cadea7d842e9059db25159d2daa
ID3v1.1: The Pogues - Lorca's Novena
Hell's Ditch [Expanded] (US Ve (track 5), 1990

unknown genre 255

Frame Identifiers ¶

Symbol	2.2	2.3+
’album-frame	TAL	TALB
’artist-frame	TP1	TPE1
’band-frame	TP2	TPE2
’bpm-frame	TBP	TBPM
’comment-frame	COM	COMM
’composer-frame	TCM	TCOM
’conductor-frame	TP3	TPE3
’content-group-frame	TT1	TIT1
’copyright-frame	TCR	TCOP
’date-frame	TDA	TDAT
’encoded-by-frame	TEN	TENC
’file-owner-frame	N/A	TOWN
’file-type-frame	TFT	TFLT
’genre-frame	TCO	TCON
’initial-key-frame	TKE	TKEY
’interpreted-by-frame	TP4	TPE4
’isrc-frame	TRC	TSRC
’langs-frame	TLA	TLAN
’length-frame	TLE	TLEN
’lyricist-frame	TXT	TEXT
’media-type-frame	TMT	TMED
’original-album-frame	TOT	TOAL
’original-artist-frame	TOA	TOPE
’original-filename-frame	TOF	TOFN
’original-lyricist-frame	TOL	TOLY
’original-release-year-frame	TOR	TORY
’part-of-a-set-frame	TPA	TPOS
’play-count-frame	CNT	PCNT
’playlist-delay-frame	TDY	TDLY
’pop-frame	POP	POPM
’publisher-frame	TPB	TPUB
’recording-dates-frame	TRD	TRDA
’settings-frame	TSS	TSSE
’size-frame	TSI	TSIZ
’station-name-frame	N/A	TRSN
’station-owner-frame	N/A	TRSO
’subtitle-frame	TT3	TIT3
’tag-cloud-frame	XTG	XTAG
’time-frame	TIM	TIME
’title-frame	TT2	TIT2
’track-frame	TRK	TRCK
’udt-frame	TXX	TXXX
’year-frame	TYE	TYER
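
The table pairs each three-character identifier (the form used by ID3v2.2) with its four-character counterpart (the form used by ID3v2.3 and later). As a small illustration of using it programmatically, here is a C++ sketch that translates between the two using a handful of rows copied from the table; to_four_char_id is a hypothetical helper, not a libscribbu function.

```cpp
#include <map>
#include <string>

// Translate a three-character frame identifier to its four-character
// counterpart using a few rows of the table above. Returns an empty string
// for identifiers that are not in this excerpt.
std::string to_four_char_id(const std::string &id3)
{
  static const std::map<std::string, std::string> ids = {
    {"TAL", "TALB"}, {"TP1", "TPE1"}, {"COM", "COMM"},
    {"TT2", "TIT2"}, {"TRK", "TRCK"}, {"TYE", "TYER"},
  };
  const auto it = ids.find(id3);
  return ids.end() == it ? std::string() : it->second;
}
```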

Character Encodings ¶

scribbu uses iconv for character encoding conversion. For convenience, here is the list of identifiers used to name encodings:

European & Russian languages ASCII, ISO_8859_1, ISO_8859_2, ISO_8859_3, ISO_8859_4, ISO_8859_5, ISO_8859_7, ISO_8859_9, ISO_8859_10, ISO_8859_13, ISO_8859_14, ISO_8859_15, ISO_8859_16, KOI8_R, KOI8_U, KOI8_RU, CP1250, CP1251, CP1252, CP1253, CP1254, CP1257, CP850, CP866, CP1131, MacRoman, MacCentralEurope, MacIceland, MacCroatian, MacRomania, MacCyrillic, MacUkraine, MacGreek, MacTurkish, Macintosh
Semitic languages ISO_8859_6, ISO_8859_8, CP1255, CP1256, CP862, MacHebrew, MacArabic
Japanese EUC_JP, SHIFT_JIS, CP932, ISO_2022_JP, ISO_2022_JP_2, ISO_2022_JP_1, ISO_2022_JP_MS
Chinese EUC_CN, HZ, GBK, CP936, GB18030, EUC_TW, BIG5, CP950, BIG5_HKSCS, BIG5_HKSCS_2004, BIG5_HKSCS_2001, BIG5_HKSCS_1999, ISO_2022_CN, ISO_2022_CN_EXT
Korean EUC_KR, CP949, ISO_2022_KR, JOHAB
Armenian ARMSCII_8
Georgian Georgian_Academy, Georgian_PS
Tajik KOI8_T
Kazakh PT154, RK1048
Thai TIS_620, CP874, MacThai
Laotian MuleLao_1, CP1133
Vietnamese VISCII, TCVN, CP1258
Platform specifics HP_ROMAN8, NEXTSTEP
Full Unicode UTF_8, UCS_2, UCS_2BE, UCS_2LE, UCS_4, UCS_4BE, UCS_4LE, UTF_16, UTF_16BE, UTF_16LE, UTF_32, UTF_32BE, UTF_32LE, UTF_7, C99, JAVA

Function Index ¶

Jump to:	G H R W

G
	Index Entry	Section

	`get-frames`	Miscellaneous Functions

H
	`has-frame?`	Miscellaneous Functions

R
	`read-id3v1-tag`	The Scheme REPL
	`read-tagset`	The Scheme REPL

W
	`with-track-in`	Miscellaneous Functions
	`write-id3v1-tag`	Writing Scheme Programs with `scribbu`
	`write-tagset`	Writing Scheme Programs with `scribbu`

Index ¶

Jump to:	< B F I M S T U

<
	Index Entry	Section

	`<comment-frame>`	`<comment-frame>`
	`<id3v2-tag>`	`<id3v2-tag>`
	`<play-count-frame>`	`<play-count-frame>`
	`<popm-frame>`	`<popm-frame>`
	`<tag-cloud-frame>`	`<tag-cloud-frame>`
	`<text-frame>`	`<text-frame>`
	`<unk-frame>`	`<unk-frame>`
	`<user-defined-text-frame>`	`<user-defined-text-frame>`

B
	Building a `libscribbu` program	Building a `libscribbu` program
	Building `az-tags`	Building `az-tags`

F
	File-based Replacements	File-based Replacements

I
	ID3v2 Frames	ID3v2 Frames
	ID3v2 Serialization	ID3v2 Serialization
	ID3v2 tags	ID3v2 tags
	Implementing `az-tags`	Implementing `az-tags`
	Introduction	Introduction
	Invoking `scribbu dump`	Invoking `scribbu dump`
	Invoking `scribbu encodings`	Invoking `scribbu encodings`
	Invoking `scribbu genre`	Invoking `scribbu genre`
	Invoking `scribbu m3u`	Invoking `scribbu m3u`
	Invoking `scribbu popm`	Invoking `scribbu popm`
	Invoking `scribbu rename`	Invoking `scribbu rename`
	Invoking `scribbu report`	Invoking `scribbu report`
	Invoking `scribbu text`	Invoking `scribbu text`
	Invoking `scribbu xtag`	Invoking `scribbu xtag`

M
	Miscellaneous Functions	Miscellaneous Functions

S
	scribbu	scribbu
	`scribbu dump` Options	`scribbu dump` Options
	`scribbu genre` Behavior on Write	`scribbu genre` Behavior on Write
	`scribbu genre` Options	`scribbu genre` Options
	`scribbu m3u` Options	`scribbu m3u` Options
	`scribbu popm` Options	`scribbu popm` Options
	`scribbu rename` Options	`scribbu rename` Options
	`scribbu report` Options	`scribbu report` Options
	`scribbu text` Options	`scribbu text` Options
	`scribbu xtag` Options	`scribbu xtag` Options
	Scripting `scribbu`	Scripting `scribbu`

T
	Tag-Based Replacements	Tag-Based Replacements
	Text Encoding	Text Encoding
	The Scheme REPL	The Scheme REPL
	The `scribbu` Program	The `scribbu` Program
	The Unsynchronisation Scheme	The Unsynchronisation Scheme

U
	Using `libscribbu`	Using `libscribbu`

scribbu ¶

Table of Contents

1 Introduction ¶

1.1 scribbu ¶

1.2 Project Background ¶

1.2.1 MP3 ¶

1.2.2 ID3 ¶

1.2.3 Winamp ¶

1.2.4 Today ¶

2 The scribbu Program ¶

2.1 Example ¶

2.1.1 scribbu dump ¶

2.1.2 scribbu report ¶

2.1.3 scribbu popm ¶

2.1.4 scribbu rename ¶

2.2 Invoking scribbu dump ¶

2.2.1 scribbu dump Options ¶

2.3 Invoking scribbu encodings ¶

2.4 Invoking scribbu genre ¶

2.4.1 scribbu genre Behavior on Write ¶

2.4.2 scribbu genre Options ¶

2.5 Invoking scribbu m3u ¶

2.5.1 scribbu m3u Options ¶

2.6 Invoking scribbu report ¶

2.6.1 scribbu report Options ¶

2.7 Invoking scribbu rename ¶

2.7.1 scribbu rename Options ¶