G+_Lee Crocker Posted May 7, 2014

Computers use numbers. Human beings prefer things like text and pictures. Computers can deal with those things, and many languages have special features for doing so, but a common mistake made by novice programmers is to use those features for a program's internal data instead of using simpler data types internally and only translating for humans when interacting with humans. At the start of a new programming project, it is well worth taking some time to consider how to use numbers to represent your program's data.

For example, say that your program is tracking orders of products from your small business. Each order will likely be in one of a small number of states: just placed by customer, credit card approved, packaged for shipping, shipped, received, etc. You could (and many beginners would) store in each order record a text string saying "new", "shipped", etc. But a better solution is to list all these states and give each one a unique number, say, 1 to 10 (however many you want to track).

This has many advantages. First, it uses your computer's resources (memory, disk space) much more efficiently. Second, your program will run faster because whenever it needs to compare its state with something, it will be comparing simple small numbers instead of big text strings. Your program will still need to contain those strings to tell humans about them, and to label the program's buttons and choose-lists for humans, but that text will be in the program, not in the data. This will also make it easier to change the text for things like a new language. It will make the program better organized and easier to understand and update. For example, consider this:

    if order.status == "new":
        do_new_things(order)
    elif order.status == "packaged":
        do_packaged_things(order)
    else:
        . . .

How would you tell from the code if every status is handled?
If you added a new status, would updating this code be enough, or is there somewhere else in the program you also need to change? In contrast, imagine something like this:

    status_names = {
        1: "new",
        2: "approved",
        3: "packaged",
        . . .
    }

    status_handlers = {
        1: do_new_things,
        2: do_approved_things,
        . . .
    }

In this code, I can see at a glance how many total statuses there are, I can see if any are missing from the list of handlers, and I can dispatch the handlers with a quick pick-from-a-list rather than a long slow series of comparisons. I can easily change the names or add new statuses without breaking too much of the program.

Obviously strings are necessary for things like names and actual paragraphs of prose. But the data your program manipulates should be organized for its benefit. That's your job as a programmer: to turn things you understand like invoices and order forms into what the computer understands: numbers.
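A minimal runnable sketch of the table-driven approach described in the post above. The post elides most of the statuses and all of the handler bodies, so the four status codes, the handler return values, and the helper names `unhandled_statuses` and `process` are assumptions made up for illustration, not part of the original.

```python
# Hypothetical status codes; the post only says "1 to 10".
NEW, APPROVED, PACKAGED, SHIPPED = 1, 2, 3, 4

# Text for humans lives in one table, not in every order record.
STATUS_NAMES = {
    NEW: "new",
    APPROVED: "approved",
    PACKAGED: "packaged",
    SHIPPED: "shipped",
}

# Placeholder handlers (assumed behaviour, for illustration only).
def do_new_things(order):
    return f"order {order['id']}: request card approval"

def do_approved_things(order):
    return f"order {order['id']}: send to packing"

def do_packaged_things(order):
    return f"order {order['id']}: hand to carrier"

# SHIPPED is deliberately left out to show how a gap is detected.
STATUS_HANDLERS = {
    NEW: do_new_things,
    APPROVED: do_approved_things,
    PACKAGED: do_packaged_things,
}

def unhandled_statuses():
    """Return the status codes that have a name but no handler."""
    return sorted(set(STATUS_NAMES) - set(STATUS_HANDLERS))

def process(order):
    """Dispatch on the integer status with a single dictionary lookup."""
    return STATUS_HANDLERS[order["status"]](order)

if __name__ == "__main__":
    print(process({"id": 7, "status": APPROVED}))
    print([STATUS_NAMES[s] for s in unhandled_statuses()])
```

Because both tables are keyed by the same integers, the question "is every status handled?" becomes a set difference rather than a read through a chain of `elif` branches.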
G+_Wesley Kerfoot Posted May 7, 2014

"But a better solution is to list all these states and give each one a unique number, say, 1 to 10"

FYI, Python has the ability to create enumerated types (like in C or Haskell): http://legacy.python.org/dev/peps/pep-0435/
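For readers who have not seen the enum module that the linked PEP introduced, here is a short sketch of the order-status example written with `IntEnum`. The class name `OrderStatus`, its four members, and the `label` helper are hypothetical; only the standard-library `enum` module itself comes from the PEP.

```python
from enum import IntEnum

class OrderStatus(IntEnum):
    """Hypothetical order states; each member is also a plain integer."""
    NEW = 1
    APPROVED = 2
    PACKAGED = 3
    SHIPPED = 4

def label(status):
    """Turn a stored integer back into text for display to a human."""
    return OrderStatus(status).name.lower()

if __name__ == "__main__":
    stored = 3  # e.g. the integer read back from an order record
    print(OrderStatus(stored), label(stored))
    print(OrderStatus.PACKAGED == 3)
```

`IntEnum` is used rather than plain `Enum` so that members compare equal to the integers stored in the data, which keeps the "numbers in the data, names in the program" split from the original post.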
G+_Lee Crocker (Author) Posted May 7, 2014

Yes, enumerated types are good if your language supports them (they're pretty new in Python), and if you know they are implemented as integers. But I've seen programmers use strings for things that are clearly numbers, but not enumerations, like playing cards. And you really should understand how to do it at a lower level. After all, enumerations are just a programming language's way of allowing the human programmer to use human-readable names at programming time that will become small integers at runtime.
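To make the playing-card remark concrete, here is one possible way to represent a card as a small integer and translate to text only at the edges. The 0 to 51 numbering, the rank and suit ordering, and the function names are all assumptions chosen for this sketch; the post does not say which encoding its author has in mind.

```python
RANKS = "23456789TJQKA"  # 13 ranks, lowest to highest (assumed order)
SUITS = "cdhs"           # clubs, diamonds, hearts, spades (assumed order)

def card_rank(card):
    """Rank index 0..12 of a card numbered 0..51."""
    return card % 13

def card_suit(card):
    """Suit index 0..3 of a card numbered 0..51."""
    return card // 13

def card_name(card):
    """Text form for humans, e.g. 'As' for the ace of spades."""
    return RANKS[card_rank(card)] + SUITS[card_suit(card)]

def parse_card(text):
    """Inverse of card_name: turn text such as 'As' back into an integer."""
    return RANKS.index(text[0]) + 13 * SUITS.index(text[1])

if __name__ == "__main__":
    hand = [parse_card(t) for t in ("As", "2c", "Td")]
    # Comparing ranks is integer arithmetic, not string handling.
    best = max(hand, key=card_rank)
    print(hand, card_name(best))
```

With this layout, questions such as "same suit?" or "higher rank?" are single integer operations, and the two-character strings appear only when reading input or printing output.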
G+_Jayunderwood Kent Posted May 8, 2014

But there is a famous saying in the world of computer science: programs are not written for computers but for people. I understand the need to use an int to represent certain data types, but not all the time, because it makes the program harder to update. Two years later you will be asking yourself what that int variable means and why you used it. I prefer using the right data structure to represent the data, e.g. tuples, dictionaries, lists, etc. Using a good data structure to represent the data is a job half done already.
G+_Wesley Kerfoot Posted May 8, 2014

Jayunderwood Kent: Yeah, Donald Knuth said that, but Alan Perlis also said: "Make no mistake about it: Computers process numbers - not symbols. We measure our understanding (and control) by the extent to which we can arithmetize an activity." Which one do you trust more? :) I would wager you can reconcile both statements.
G+_Lee Crocker (Author) Posted May 8, 2014

... And of course you need higher-level structures as well, like lists and dictionaries. But if you don't understand how they are actually implemented under the hood as vectors, trees, hash tables, and so on, then you won't know which is best for your application. I'm all in favor of high-level programming--I love object-composition patterns and abstract types and closures and all that good stuff. But in the end, you're still pushing bytes around in memory, and you need to know what the code is really doing.
G+_Jayunderwood Kent Posted May 8, 2014

Well, when you put it that way it makes perfect sense. I believe that to be good, a programmer needs to understand what happens at a lower level as well as at a higher level. They need to understand how data is represented in memory or, as they say, underneath the hood. So in that sense I do agree with you, because most programmers do not understand this.
G+_abby Sand Posted May 8, 2014

Understanding how a program works underneath the hood will make you a better programmer. First, it allows the programmer to be more efficient with the computer's resources. Second, it gives them a low-level understanding of how data is represented within the computer's memory. At the end of the day, as Lee Crocker put it, we are just pushing bytes around in memory, and it is very important that a programmer understand the low-level mechanics of a program as well as the high-level ones. But hey, that's just my opinion. Many novice programmers do not. They represent their data using the wrong data structure, such as strings, when a different data structure would serve better. As Jayunderwood Kent mentioned, the right data structure is half the problem solved.
G+_Darryl Medley Posted May 8, 2014

Speaking as a professional business applications developer, I agree that you should use small codes in the database and apply full descriptions for display purposes. But numbers aren't always smaller than strings. In many databases, an integer field will use 4 bytes of space, but a 2-character string will only take 2 bytes, or possibly 3 if the DB stores the length. Plus, using strings gives greater flexibility in values. Here's a sample of some of my 2-character order status codes: 00 = New, 10 = Printed, CH = Credit Hold. Variables in programs that hold some kind of state or status should definitely be enums, numbers, bools, etc., not strings, and this should be taught in the podcast. Again, it comes down to understanding how the computer really processes data and being able to use the best code for the job.
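The size comparison above can be illustrated with Python's standard `struct` module, which packs values into fixed-width byte fields. This only shows the raw field widths; how a particular database actually stores a column is not something this sketch can demonstrate. The `STATUS_DESCRIPTIONS` table reuses the three codes quoted in the post, and the variable and function names are assumptions.

```python
import struct

# The two-character codes quoted in the post, mapped to display text.
STATUS_DESCRIPTIONS = {
    "00": "New",
    "10": "Printed",
    "CH": "Credit Hold",
}

# Fixed-width field sizes in bytes ("<" selects standard sizes, no padding).
INT32_SIZE = struct.calcsize("<i")    # 4-byte signed integer
CODE2_SIZE = struct.calcsize("<2s")   # 2-byte character field
UINT8_SIZE = struct.calcsize("<B")    # 1-byte unsigned integer

def describe(code):
    """Translate a stored two-character code into text for display."""
    return STATUS_DESCRIPTIONS[code]

if __name__ == "__main__":
    packed = struct.pack("<2s", b"CH")
    print(len(packed), describe(packed.decode("ascii")))
    print(INT32_SIZE, CODE2_SIZE, UINT8_SIZE)
```

The two-character field is indeed half the width of a 4-byte integer. Where a storage layer offers a 1-byte integer type, that is narrower again, so the trade-off depends on which field types are actually available.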