
I am trying to read MIDI music files and process them a bit using the music21 library. I am using a self-defined read_midi function and getting this error: "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa9 in position 10: invalid start byte"

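import os
#array processing
import numpy as np

#specify the path
path = 'audio/'

#read all the filenames
files = [i for i in os.listdir(path) if i.endswith(".mid")]

#read each midi file (read_midi must already be defined when this runs)
#dtype=object because each file yields a different number of notes
notes_array = np.array([read_midi(os.path.join(path, i)) for i in files], dtype=object)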

Here is the read_midi function:

from music21 import converter, instrument, note, chord

def read_midi(file):

    print("Loading Music File:", file)

    notes = []
    notes_to_parse = None

    #parsing a midi file
    midi = converter.parse(file)

    #grouping based on different instruments
    s2 = instrument.partitionByInstrument(midi)

    #looping over all the instruments
    for part in s2.parts:

        #select elements of only piano
        if 'Piano' in str(part):

            notes_to_parse = part.recurse()

            #finding whether a particular element is a note or a chord
            for element in notes_to_parse:

                #note
                if isinstance(element, note.Note):
                    notes.append(str(element.pitch))

                #chord
                elif isinstance(element, chord.Chord):
                    notes.append('.'.join(str(n) for n in element.normalOrder))

    return np.array(notes)

Kindly suggest how I can get rid of this error.

Apparently, it tries to read characters from a binary file. Looks like a bug in your version of music21. - CL.
Yeah, I guess so. I checked their issues page and it says something similar. Thanks for the response. - DhruvStan7
Also, I discovered a strange thing: I swapped in some different MIDI files and now it partly works. It reads some files and gets stuck on others. - DhruvStan7
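
Since only some files fail, it may help to find out which ones. Below is a minimal sketch, not part of the original thread: split_by_parse_success is a hypothetical helper name, and parse_fn stands in for whatever parser you use (here that would presumably be converter.parse). It only catches UnicodeDecodeError, so any other failure still surfaces normally.

```python
from typing import Callable, Iterable, List, Tuple


def split_by_parse_success(
    paths: Iterable[str],
    parse_fn: Callable[[str], object],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Call parse_fn on every path.

    Returns (ok_paths, failures), where failures is a list of
    (path, error message) pairs for files that raised UnicodeDecodeError.
    """
    ok: List[str] = []
    failed: List[Tuple[str, str]] = []
    for p in paths:
        try:
            parse_fn(p)
        except UnicodeDecodeError as exc:
            # Record the file and the decoder's message, then keep going.
            failed.append((p, str(exc)))
        else:
            ok.append(p)
    return ok, failed
```

With the question's variables this could be called as split_by_parse_success([path + f for f in files], converter.parse), after which the failures list shows which files trip the decoder.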

1 Answer


An answer I got from the music21 Google Group fixed my problem:

Hi, and thanks for the report. This is a regression caused by a new feature in 6.1.0 that creates Instrument objects from the text of MIDI track names. It's fixed in the next unreleased version (likely to be 6.2.0), which is available now on GitHub. If that's too cumbersome to install, you can also just edit your own copy of music21 to apply the fix found here: https://github.com/cuthbertLab/music21/pull/607/files

For the curious, the original feature wrongly assumed all MIDI track names would be encoded using utf-8. The files we found to fail each had a copyright symbol in the track name, and they were each created by "www.piano-midi.de". Would you mind sharing what MIDI writer created your file?

Also, I would very much appreciate you sharing this answer on Stack Overflow, since I'm not active there.

Cheers, and happy music21-ing,
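
To illustrate the encoding assumption described in the quoted reply, here is a small sketch that reproduces the same error message without music21. The bytes in RAW_TRACK_NAME are made up for the example: 0xa9 is the copyright symbol in Latin-1, but on its own it is not a valid UTF-8 start byte. Decoding as latin-1 is shown only for contrast; I have not checked that this is what the linked fix does.

```python
# Made-up track-name bytes: "Copyright " is 10 bytes, so 0xa9 sits at index 10.
RAW_TRACK_NAME = b"Copyright \xa9 example"


def try_decode(data: bytes, encoding: str):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# UTF-8 rejects the lone 0xa9 byte.
utf8_text, utf8_error = try_decode(RAW_TRACK_NAME, "utf-8")

# Latin-1 maps every byte to a character, so it cannot fail here.
latin1_text, latin1_error = try_decode(RAW_TRACK_NAME, "latin-1")

print(utf8_error)
print(latin1_text)
```

The first print shows the decoder complaining about byte 0xa9 at position 10, matching the error in the question; the second shows the name with the copyright symbol intact.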