


G+_Darryl Medley


https://github.com/Darryl-Medley/Coding-101/blob/master/ReadSortWrite.py

A simple demo program that shows how to read a comma-delimited text file, sort on any column, then write it out to a new file. The input data is in a file called name_list.txt with the format: Name, Age, Zipcode. For example:

Padre, 30, 9000

Shannon, 21, 9100

Darryl, 50, 9200
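The flow described above can be sketched roughly as follows. This is not the linked ReadSortWrite.py: the function name read_sort_write, the output file name, and the use of a temporary directory are assumptions made for illustration. The rows are the three sample lines shown above.

```python
import os
import tempfile
from operator import itemgetter


def read_sort_write(in_path, out_path, column):
    """Read comma-delimited rows, sort them on one column, write them out."""
    with open(in_path) as infile:
        # One sub-list per line, with whitespace stripped from every field.
        rows = [[fld.strip() for fld in line.split(",")] for line in infile]
    rows.sort(key=itemgetter(column))  # fields are still strings here
    with open(out_path, "w") as outfile:
        for row in rows:
            outfile.write(", ".join(row) + "\n")
    return rows


# Demo in a temporary directory so nothing real is overwritten.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "name_list.txt")
out_path = os.path.join(workdir, "sorted_list.txt")  # hypothetical output name
with open(in_path, "w") as f:
    f.write("Padre, 30, 9000\nShannon, 21, 9100\nDarryl, 50, 9200\n")

sorted_rows = read_sort_write(in_path, out_path, column=0)  # sort by Name
print(sorted_rows[0])
```

Passing column=1 or column=2 would sort on Age or Zipcode instead, still comparing the fields as strings.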

 

Link to test file:

https://github.com/Darryl-Medley/Coding-101/blob/master/name_list.txt



I tried downloading and running this, and I think I ended up breaking it because of an extra blank line in the text file. But that gave me the chance to step through the code, decode it, and debug it. That sort key/itemgetter syntax is pretty neat, especially since the same key= approach works on objects too (with attrgetter).
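One way to make the parsing tolerate a stray blank line is to skip lines that are empty after stripping. This is a minimal sketch that assumes the lines have already been read into a list. The names raw_lines and rows are illustrative and not taken from the program.

```python
# Lines as they might come from the file, including a trailing blank line.
raw_lines = ["Padre, 30, 9000\n", "Shannon, 21, 9100\n", "Darryl, 50, 9200\n", "\n"]

# Keep only lines that still contain something after stripping whitespace.
rows = [[fld.strip() for fld in line.split(",")]
        for line in raw_lines if line.strip()]

print(len(rows))
```

Without the filter, the blank line turns into a row with a single empty field, so indexing its Age or Zipcode column raises an IndexError. That is probably the kind of failure described above.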

 

https://wiki.python.org/moin/HowTo/Sorting
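As a small illustration of the "works on objects too" point, here is a sketch using operator.attrgetter next to operator.itemgetter. The Person class is hypothetical and exists only for this example.

```python
from operator import attrgetter, itemgetter


class Person:
    """Hypothetical record type used only for this example."""

    def __init__(self, name, age):
        self.name = name
        self.age = age


people = [Person("Padre", 30), Person("Shannon", 21), Person("Darryl", 50)]

# attrgetter("age") builds a function that returns obj.age for each object.
by_age = sorted(people, key=attrgetter("age"))
print([p.name for p in by_age])

# itemgetter(1) does the same job for sequences, returning row[1].
rows = [("Padre", 30), ("Shannon", 21), ("Darryl", 50)]
print(sorted(rows, key=itemgetter(1)))
```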

 

I'm trying to go through as many examples as I can so I can start seeing the patterns used - especially for list comprehension - and this really helps.


This program kind of assumes that the data file format is good. I wanted to focus on the file I/O and sorting routines.

After uploading this I realized that there is a bug in the sort logic that doesn't show up with the test data I'm using. Can anyone spot it?


Wow, I've got to admit line 37 was a pretty sweet use of list comprehensions.

 

But I have to ask why you put one list comprehension inside another. My understanding is that a list comprehension always returns a list when it is executed, hence the name.

 

why not like this:

 

 [fld.strip() for s in infoList for fld in s.split(",")]

 

It will still return a list, [value1, value2], and fld.strip() will strip out the newlines, with every field stored in that same list.

 

When you do it this way:

 

[[fld.strip() for fld in s.split(",")] for s in infoList]

 

you get:

s = [[value1, value2, value3]]

 

then you have to run two for loops to get the data inside.

 

for row in s:
    for data in row:
        print(data)  # each field in that row

 

The outer for loop yields each sub-list and the inner one yields each field inside it.
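The difference between the two forms is easiest to see side by side. In this sketch, info_list is a stand-in for the program's infoList, holding two of the sample lines. Note that the for clauses of a flat comprehension are read left to right, so the loop over lines has to come before the loop over fields.

```python
info_list = ["Padre, 30, 9000\n", "Shannon, 21, 9100\n"]

# Flat: the outer loop (over lines) must come first in the comprehension.
flat = [fld.strip() for s in info_list for fld in s.split(",")]

# Nested: the inner comprehension builds one sub-list per line.
nested = [[fld.strip() for fld in s.split(",")] for s in info_list]

print(flat)
print(nested)
```

The flat form gives one list of six fields, while the nested form gives two rows of three fields each.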

 

 

I'd just like to know your thought process.


Creating a sub-list for each string in the main list allows the "sort on any column"  using key=itemgetter to work properly. (Which I discovered the hard way.) I can't take credit for the nested list comprehensions. I found that on StackOverflow after my attempts to guess the correct syntax failed.
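A short sketch of why the sub-lists matter: itemgetter(column) picks one field out of each row, and the whole row moves with it when sorting. The column variable is a hypothetical stand-in for the user's menu choice.

```python
from operator import itemgetter

rows = [["Padre", "30", "9000"], ["Shannon", "21", "9100"], ["Darryl", "50", "9200"]]

column = 0  # hypothetical user choice: 0 = Name, 1 = Age, 2 = Zipcode
rows.sort(key=itemgetter(column))
print([row[0] for row in rows])

# A flat list of fields has no rows left to keep together when sorting.
flat = [fld for row in rows for fld in row]
print(sorted(flat)[:3])
```

Sorting the flat list just orders the individual strings, so the link between a name and its age and zipcode is lost.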


so I added a function: 

 

def retype(value):
    # Convert all-digit strings to int so they sort numerically.
    if value.isdigit():
        return int(value)
    else:
        return value

 

and changed the sort to:

 

infoList.sort(key=lambda k:retype(k[int(optn[0])-1]))

 

and it seems to be working with numbers of different digit lengths, but I'm not sure if that's the most efficient way it could have been done.
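To check the behaviour with ages of different lengths, here is a self-contained sketch of the same idea. The rows are made up for the test, and column is a hypothetical stand-in for int(optn[0])-1 in the sort call above.

```python
def retype(value):
    """Return value as an int if it is all digits, otherwise unchanged."""
    if value.isdigit():
        return int(value)
    return value


# Made-up rows with ages of different digit lengths.
rows = [["A", "30", "9000"], ["B", "3", "9100"], ["C", "29", "9200"]]

column = 1  # hypothetical stand-in for int(optn[0]) - 1
rows.sort(key=lambda row: retype(row[column]))
print([row[1] for row in rows])
```

The key function is called once per row, so the extra cost is small. One caveat in Python 3: if a column mixes all-digit and non-digit fields, the key returns both int and str values, and comparing those raises a TypeError.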


That's the bug! I'm treating numbers as strings, so "3" compares as greater than "29" when it shouldn't. It's really just a problem for the Age column, since Zipcodes are always 5 digits. I think the most efficient thing to do would be to convert the Age to a number in the list and not change the sort, but I'd like to see other solutions.
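The string comparison behind the bug can be reproduced in a few lines. The ages list is made up for the demonstration.

```python
ages = ["29", "3", "50"]

print("3" > "29")             # strings compare character by character
print(sorted(ages))           # lexicographic order
print(sorted(ages, key=int))  # numeric order
```

Because "3" is compared with "2" first, the string "3" lands after "29" no matter how many digits follow.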


Yeah - converting the data properly when reading it in would probably be the best idea in case you had to do something else later on that relied on the data in those columns being integers.
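Converting at read time might look something like this. It is a sketch only: parse_line is a hypothetical helper, the ages are made up to exercise the bug, and it assumes every line has exactly three fields.

```python
from operator import itemgetter


def parse_line(line):
    """Split one 'Name, Age, Zipcode' line and convert Age to an int."""
    name, age, zipcode = [fld.strip() for fld in line.split(",")]
    return [name, int(age), zipcode]  # Zipcode is left as a string


lines = ["Padre, 30, 9000", "Shannon, 3, 9100", "Darryl, 29, 9200"]  # made-up ages
rows = [parse_line(line) for line in lines]
rows.sort(key=itemgetter(1))  # the sort itself is unchanged
print(rows)
```

With this approach the write step has to turn Age back into text, for example with str(field), before joining the fields into a line.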
