G+_Sean Miller Posted September 17, 2018 Question--something I've always idly wondered: Is a byte ALWAYS 8 bits? On a 64-bit operating system, shouldn't a byte be 64 bits? I have always thought of a byte as a single character, and more bits allows for more characters, like drawing with a 64-crayon box instead of a 16-crayon box. If it's in terms of the size of your alphabet, UTF-8 is a 32-bit encoding scheme--it takes 32 bits to represent a single character. So if it's based on the size of the alphabet, then a byte would be 32 bits for UTF-8 and 7 bits for ASCII. Is "a byte = 8 bits" an outdated idea?
G+_Michael Hagberg Posted September 17, 2018 A byte is always 8 bits. 64 bits is a word, as is 16 or 32 bits, depending on the CPU.
G+_J Miller Posted September 17, 2018 But a shave and a haircut are only 2 bits.
G+_Tailsthefox Pelissier Posted September 17, 2018 A bit is the smallest piece of information. Today that old war of 8-bit, 16-bit, 32-bit, 64-bit, and 128-bit that bullied kids for years on school playgrounds has no meaning. In the case of PCs and Macs it does still have one useful implication: how much RAM you can use.
G+_Black Merc Posted September 17, 2018 Michael Hagberg is right. A 64-bit processor really just means it can access more addresses (such as more RAM--remember the 4 GB RAM limit of days past) and crunch larger numbers without taking extra compute cycles. Some will mention the "instruction set", but that hasn't really changed.
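A minimal sketch of the addressing point above, assuming a typical desktop C toolchain (the sizes in the comments are typical results, not guarantees): what actually changes between a 32-bit and a 64-bit build is the size of a pointer, which is why 64-bit machines can address more than 4 GB.

```c
#include <stdio.h>

int main(void) {
    /* sizeof(void *) is typically 4 bytes on a 32-bit build (a 4 GB
     * address space) and 8 bytes on a 64-bit build. A char stays
     * 1 byte either way. */
    printf("sizeof(char)   = %zu\n", sizeof(char));
    printf("sizeof(void *) = %zu\n", sizeof(void *));
    return 0;
}
```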
G+_Paul Hutchinson Posted September 17, 2018 Michael Hagberg Actually, a word is always 16 bits, 32 bits is a dword (double word), and 64 bits is a qword (quad word). I suspect you are conflating a word with a "C" integer. In standard "C" an integer is either a word, dword, or qword, depending on the register size of the microprocessor.
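A quick sketch of the "C integer depends on the platform" point; the sizes in the comments are typical results for common toolchains, not guarantees from the C standard:

```c
#include <stdio.h>

int main(void) {
    /* The C standard only guarantees minimum ranges, so the actual
     * sizes depend on the compiler and target CPU. Typical results
     * for a 64-bit Linux toolchain are shown in the comments. */
    printf("short:     %zu bytes\n", sizeof(short));     /* 2 */
    printf("int:       %zu bytes\n", sizeof(int));       /* 4 */
    printf("long:      %zu bytes\n", sizeof(long));      /* 8 here, 4 on 64-bit Windows */
    printf("long long: %zu bytes\n", sizeof(long long)); /* 8 */
    return 0;
}
```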
G+_Sean Miller Posted September 17, 2018 Thanks very much for the info.
G+_Jeff Gros Posted September 17, 2018 Paul Hutchinson I think you and Michael Hagberg are talking about different things. You seem to be talking about the WORD and DWORD programming definitions; Michael is talking about computer architecture.

In computer architecture, a word is the "largest natural size for arithmetic", which is the size of the registers. This is typically the size of the data bus, but doesn't have to be. When we refer to a 64-bit OS, we refer to the size of the data bus. The more data we can transfer at once, the more efficient the transaction is. On ARM Cortex-M, the word size is 32 bits. On MSP430 it is 16 bits. On PIC/Atmel, it is 8 bits.

I'm not 100% sure whether WORD and DWORD are actually part of the C standard, since the standard refers to int, long int, long long, etc. instead. I'll have to look in my copy of the "C Reference Manual" (Harbison and Steele) when I get to work. Speaking of work, I need to stop typing and get out of here! Cheers!
G+_Paul Hutchinson Posted September 17, 2018 Jeff Gros Thanks for the info; I didn't realize different areas of computer science have different definitions for the same term. I always thought data bus size, register size, and word length didn't create any confusing, contradictory definitions, and that "word" was only used for a bit length of 16.
G+_Giligain I. Posted September 17, 2018 Song time: [Cubico AR Kids Coding] Coding Song "Zero One Song~?"
G+_Jeff Gros Posted September 18, 2018 Paul Hutchinson If you want to be really confused, look back at the history of computing. This stuff evolved organically, and there were no standards. Computers were expensive, so every bit counted. Modern computers use 8-bit bytes. It didn't used to be that way! You could have 4-bit bytes, 6-bit bytes, 7-bit bytes, or whatever else you could dream up. I think the strangest device I've come across (that is still relatively modern) is a 4-bit microcontroller used to drive the LCD in a watch.
G+_Paul Hutchinson Posted September 18, 2018 Jeff Gros No wonder so many scientists from other branches say CS is NOT really a science. They can't even establish a consistent vocabulary and metrology.
G+_Jeff Gros Posted September 18, 2018 Sean Miller Also, looking back at the comments, I think we all glossed over that bit at the end of your question about character encoding. UTF-8 is actually not fixed at 32 bits; it is variable length. Each character uses 1 to 4 bytes. This is done for backwards compatibility with 7-bit ASCII, and to allow larger character sets as needed. However, whether an application stores every character in 4 bytes internally (even when only 1 is needed to encode it) is, of course, application dependent. I wouldn't be surprised if some applications just use 4-byte characters for everything for simplicity of implementation.
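A minimal sketch of the variable-length point, assuming a C11 compiler (the u8 prefix and \u escapes need C11); strlen counts the encoded bytes, not the characters:

```c
#include <stdio.h>
#include <string.h>

int main(void) {
    /* One character each, but 1, 2, 3, and 4 bytes in UTF-8. */
    const char *samples[] = {
        u8"A",          /* U+0041, 1 byte                  */
        u8"\u00E9",     /* U+00E9 'e' with acute, 2 bytes  */
        u8"\u20AC",     /* U+20AC euro sign, 3 bytes       */
        u8"\U0001F600"  /* U+1F600 emoji, 4 bytes          */
    };
    for (int i = 0; i < 4; i++) {
        printf("sample %d: %zu byte(s)\n", i, strlen(samples[i]));
    }
    return 0;
}
```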
G+_Sean Miller Posted September 18, 2018 Thank you very much for the responses. So if I understand this correctly, a byte is always 8 bits, but a word can have different lengths depending on the operating system.
G+_Giligain I. Posted September 18, 2018 Common Computer Conversions http://www.beesky.com/newsite/bit_byte.htm • yep, 2 bits = 0.25 bytes
G+_Giligain I. Posted September 18, 2018 #huh Nibble = half-byte https://en.m.wikipedia.org/wiki/Nibble In computing, a nibble (occasionally nybble or nyble to match the spelling of byte) is a four-bit aggregation, or half an octet. It is also known as half-byte or tetrade. In a networking or telecommunication context, the nibble is often called a semi-octet, quadbit, or quartet. A nibble has sixteen (2^4) possible values. A nibble can be represented by a single hexadecimal digit and called a hex digit.
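A small sketch of the nibble/hex-digit relationship (the example value is arbitrary): splitting a byte into its two nibbles with a shift and a mask, each of which prints as exactly one hex digit:

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t byte = 0xB7;                /* one byte = two nibbles */
    uint8_t high = (byte >> 4) & 0x0F;  /* upper nibble: 0xB      */
    uint8_t low  = byte & 0x0F;         /* lower nibble: 0x7      */
    printf("byte 0x%02X -> high nibble 0x%X, low nibble 0x%X\n",
           (unsigned)byte, (unsigned)high, (unsigned)low);
    return 0;
}
```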
G+_Paul Hutchinson Posted September 19, 2018 Sean Miller It appears that a byte can also be other lengths according to what Jeff Gros posted. However in the 21st century it's usually:
Nibble = 4 bits
Byte = 8 bits
Word = 16 bits
Double Word = 32 bits
Quad Word = 64 bits
G+_Jeff Gros Posted September 19, 2018 I don't mean to pick nits here, but to be absolutely clear we need to specify whether we are talking about data types from a specific language or about computer architecture.

If you are using a Windows machine and a C/C++ compiler, then the size definitions at the link below (which define WORD, DWORD, etc.) are applicable: docs.microsoft.com - Windows Data Types | Microsoft Docs. This aligns with the sizes specified by Paul Hutchinson.

This differs from the computer architecture term "word", which, as I explained previously, could be anything, but is probably 8, 16, 32, or 64 bits. If you are on a desktop, it is probably 32 or 64. If you are on a microcontroller, it could be any of them.

The possible ambiguity of word, dword, etc. is the reason I don't use these definitions when writing code. I use the standard types instead (stdint.h): uint8_t, uint16_t, etc. It is common for me to write microcontroller code that sends data to a Windows app. If I use the standard type definitions in stdint.h, I don't have to worry as much about portability. I can share serial packet processing code, and it will just work.
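A minimal sketch of the approach Jeff describes, with a hypothetical packet layout (the field names are made up for illustration) and assuming a C11 compiler for static_assert: fixed-width types from stdint.h keep each field the same size whether the code is compiled for a microcontroller or a PC.

```c
#include <stdint.h>
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical serial packet header shared between a microcontroller
 * and a desktop app. Fixed-width types pin down each field's size. */
typedef struct {
    uint8_t  command;      /* exactly 8 bits on every platform */
    uint16_t payload_len;  /* exactly 16 bits                  */
    uint32_t checksum;     /* exactly 32 bits                  */
} packet_header_t;

/* Compile-time checks that the fixed-width guarantees hold. Note the
 * struct as a whole may still contain padding unless it is packed. */
static_assert(sizeof(uint8_t)  == 1, "uint8_t must be 1 byte");
static_assert(sizeof(uint16_t) == 2, "uint16_t must be 2 bytes");
static_assert(sizeof(uint32_t) == 4, "uint32_t must be 4 bytes");
```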