G+_Darryl Medley Posted May 2, 2014
https://github.com/Darryl-Medley/Coding-101/blob/master/ReadSortWrite.py
A simple demo program that shows how to read a comma-delimited text file, sort it on any column, then write it out to a new file. The input data is in a file called name_list.txt with the format: Name, Age, Zipcode. For example:
Padre, 30, 9000
Shannon, 21, 9100
Darryl, 50, 9200
Link to test file: https://github.com/Darryl-Medley/Coding-101/blob/master/name_list.txt
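For readers who don't want to open the repository, here is a rough sketch of the read, sort, write flow described in this post. It is not the code from ReadSortWrite.py; the helper names `read_rows` and `write_rows` and the output file name are made up for illustration, and it assumes every line is well formed.

```python
import os
import tempfile
from operator import itemgetter


def read_rows(path):
    """Read a comma-delimited file into a list of [name, age, zipcode] lists."""
    with open(path) as f:
        return [[fld.strip() for fld in line.split(",")] for line in f]


def write_rows(path, rows):
    """Write each row back out as one comma-delimited line."""
    with open(path, "w") as f:
        for row in rows:
            f.write(", ".join(row) + "\n")


# Work in a temporary directory so no real files are touched.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "name_list.txt")
out_path = os.path.join(workdir, "sorted_list.txt")  # hypothetical output name

# Recreate the three sample rows from the post.
with open(in_path, "w") as f:
    f.write("Padre, 30, 9000\nShannon, 21, 9100\nDarryl, 50, 9200\n")

rows = read_rows(in_path)
rows.sort(key=itemgetter(0))  # column 0 is Name
write_rows(out_path, rows)

with open(out_path) as f:
    result = f.read()
print(result)
```

Changing the index passed to `itemgetter` selects a different sort column; every field is still a string at this point.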
G+_L I Posted May 2, 2014
I tried downloading and running this, and I think I ended up breaking it because of an extra blank line in the text file. But it gave me the chance to step through, decode, and debug it. That sort key/itemgetter syntax is pretty neat, especially since it works on objects too. https://wiki.python.org/moin/HowTo/Sorting
I'm trying to go through as many examples as I can so I can start seeing the patterns used, especially for list comprehensions, and this really helps.
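A small sketch of the "works on objects too" point, for anyone following along: `itemgetter` picks a value by index (or key), and its sibling `attrgetter` from the same `operator` module does the equivalent for named attributes. The `Person` type below is hypothetical and exists only for this example.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

# Plain lists: itemgetter(0) pulls out the first element of each row.
rows = [["Padre", "30"], ["Shannon", "21"], ["Darryl", "50"]]
by_name = sorted(rows, key=itemgetter(0))

# Objects with named attributes: attrgetter("age") pulls out .age instead.
Person = namedtuple("Person", ["name", "age"])  # hypothetical record type
people = [Person("Padre", 30), Person("Shannon", 21), Person("Darryl", 50)]
by_age = sorted(people, key=attrgetter("age"))

print(by_name)
print([p.name for p in by_age])
```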
G+_Darryl Medley (Author) Posted May 2, 2014
This program kind of assumes that the data file format is good. I wanted to focus on the file I/O and sorting routines. After uploading this I realized that there is a bug in the sort logic that doesn't show up with the test data I'm using. Can anyone spot it?
G+_Jayunderwood Kent Posted May 2, 2014
Wow, I've got to admit line 37 was a pretty sweet use of list comprehensions. But I have to ask why you build another list inside the list comprehension. My understanding is that a list comprehension always returns a list when executed, hence the name. Why not like this:
[fld.strip() for s in infoList for fld in s.split(",")]
It will still return a list [value1, value2] and fld.strip() will strip out all the newlines and store everything in the same list. When you do it this way:
[[fld.strip() for fld in s.split(",")] for s in infoList]
you get s = [[value1, value2, value3]], and then you have to run two for loops to get at the data inside:
for row in s: for data in row:
The first for loop returns each sub-list and the second returns the data. I'd just like to know your thought process.
G+_Darryl Medley (Author) Posted May 2, 2014
Creating a sub-list for each string in the main list allows the "sort on any column" using key=itemgetter to work properly. (Which I discovered the hard way.) I can't take credit for the nested list comprehensions. I found that on StackOverflow after my attempts to guess the correct syntax failed.
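To make the difference concrete, here is a minimal sketch comparing the flat and nested comprehensions discussed above. The `infoList` contents are assumed to be the raw lines of the sample file; the names `flat` and `nested` are only for illustration.

```python
from operator import itemgetter

# Raw lines as they might come back from readlines() on the sample file.
infoList = ["Padre, 30, 9000\n", "Shannon, 21, 9100\n", "Darryl, 50, 9200\n"]

# Flat version: one long list of fields, so the row boundaries are lost.
flat = [fld.strip() for s in infoList for fld in s.split(",")]

# Nested version: one sub-list per line, so each row stays together.
nested = [[fld.strip() for fld in s.split(",")] for s in infoList]

# itemgetter(2) picks the third field of each row. On the flat list it would
# instead pick the third character of each string, which is not a column.
nested.sort(key=itemgetter(2), reverse=True)

print(flat)
print(nested)
```

With the nested form, each element handed to the sort key is a whole row, which is what lets a single index select the column.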
G+_Jayunderwood Kent Posted May 2, 2014
Yeah, that makes sense, since key=itemgetter works on lists.
G+_Jayunderwood Kent Posted May 2, 2014
StackOverflow: a programmer's friend.
G+_L I Posted May 2, 2014
Is the bug that the sort would be comparing string numbers rather than integer numbers for the second and third columns?
G+_L I Posted May 2, 2014
So I added a function:
def retype(value):
    if value.isdigit():
        return int(value)
    else:
        return value
and changed the sort to:
infoList.sort(key=lambda k: retype(k[int(optn[0])-1]))
It seems to be working with numbers of different digit lengths, but I'm not sure that's the most efficient way it could have been done.
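A self-contained sketch of the same idea, for anyone who wants to try it outside the full program. The rows and the `col` variable are made up for illustration (`col` stands in for the menu choice held in `optn`), with ages of different digit lengths so the effect is visible.

```python
def retype(value):
    """Return digit-only strings as int; leave everything else unchanged."""
    return int(value) if value.isdigit() else value


# Hypothetical rows, chosen so the ages have different numbers of digits.
infoList = [
    ["Padre", "30", "9000"],
    ["Kid", "3", "9100"],
    ["Darryl", "29", "9200"],
]

col = 1  # stand-in for the user's column choice; 1 is the Age column
infoList.sort(key=lambda row: retype(row[col]))

print([row[0] for row in infoList])
```

On efficiency: the sort calls the key function once per row rather than once per comparison, so the conversion cost is small. One caveat under Python 3 is that a column mixing digit and non-digit values would raise a TypeError, because int and str cannot be compared with each other.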
G+_Darryl Medley (Author) Posted May 2, 2014
That's the bug! I'm treating numbers as strings, so "3" sorts as greater than "29" when it shouldn't. It really only matters for the Age column, since Zipcodes are always 5 digits. I think the most efficient fix would be to convert the Age to a number in the list and leave the sort unchanged, but I'd like to see other solutions.
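The bug is easy to reproduce in isolation. The ages below are made up; the point is only that strings compare character by character, so "3" lands after "29".

```python
ages = ["29", "3", "50", "21"]  # hypothetical ages, stored as text

# String comparison goes character by character: "3" > "2", so "3" > "29".
print("3" > "29")  # True

as_text = sorted(ages)              # lexicographic order
as_numbers = sorted(ages, key=int)  # numeric order, values stay strings

print(as_text)     # ['21', '29', '3', '50']
print(as_numbers)  # ['3', '21', '29', '50']
```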
G+_L I Posted May 3, 2014
Yeah, converting the data properly when reading it in would probably be the best idea, in case you had to do something else later on that relied on the data in those columns being integers.
G+_Darryl Medley (Author) Posted May 12, 2014
Updated the program so it sorts all ages correctly and skips any blank lines.
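The updated program isn't quoted in the thread, so the following is only a guess at how the two fixes might look together: skip blank lines while parsing, and convert Age to an int at read time so the sort itself needs no changes. The `parse_lines` helper and the sample lines are hypothetical.

```python
from operator import itemgetter


def parse_lines(lines):
    """Turn raw 'Name, Age, Zipcode' lines into rows, skipping blank lines."""
    rows = []
    for line in lines:
        if not line.strip():
            continue  # ignore empty or whitespace-only lines
        name, age, zipcode = [fld.strip() for fld in line.split(",")]
        # Age becomes an int so it sorts numerically; Zipcode stays a string
        # so that any leading zeros would be preserved.
        rows.append([name, int(age), zipcode])
    return rows


# Made-up input with blank lines and ages of different digit lengths.
sample = ["Padre, 30, 9000\n", "\n", "Shannon, 3, 9100\n", "Darryl, 29, 9200\n", "   \n"]

rows = parse_lines(sample)
rows.sort(key=itemgetter(1))  # plain itemgetter now sorts Age numerically

print(rows)
```

One consequence of converting at read time is that Age has to be turned back into a string before the rows are written out again.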