Tokens in C: Keywords, Identifiers, Constants, and Operators Explained

Every C program breaks down into six token types. Here is what each one does, the rules that govern each, and the mistakes that trip up students in placement tests.

By FACE Prep Team 8 min read

A token is the smallest meaningful unit of a C program, the atomic piece the compiler works with before it understands anything else about your code.

Six categories cover every token in C: keywords, identifiers, constants, strings, special symbols, and operators. Get these wrong and the compiler never reaches your logic. Get them right and you can catch an entire class of placement-test traps before they trip you.

What Is a Token in C?

When a C compiler reads source code, the first step is lexical analysis: breaking the raw text stream into tokens. Before the compiler checks whether your logic makes sense, it checks whether each token is valid. This is why a misspelled keyword produces a compile error before any runtime logic runs.

Consider this one-liner:

int sum = a + b;

The lexer sees seven tokens: int (keyword), sum (identifier), = (assignment operator), a (identifier), + (arithmetic operator), b (identifier), and ; (special symbol). Each belongs to exactly one token type. The compiler applies different parsing rules to each type, which is why int and sum, both plain words to a human reader, behave very differently to the compiler.

Keywords: C’s 32 Reserved Words

Keywords are words the C language has claimed for itself. The compiler assigns each a fixed meaning, and no program may reassign that meaning by using a keyword as a variable or function name. All 32 ANSI C keywords are lowercase:

Category | Keywords
Data types | char, double, float, int, long, short, signed, unsigned, void
Storage class | auto, extern, register, static
Control flow | break, case, continue, default, do, else, for, goto, if, return, switch, while
Type qualifiers | const, volatile
Type definition | enum, struct, typedef, union
Size operator | sizeof

That accounts for all 32. C99 added restrict, inline, _Bool, _Complex, and _Imaginary. C11 added a further seven (_Alignas, _Alignof, _Atomic, _Generic, _Noreturn, _Static_assert, _Thread_local). Placement MCQs almost always reference the C89 count of 32.

The Case-Sensitivity Trap

The 32 C89 keywords are strictly lowercase. int is a keyword. Int is a valid identifier. INT is also a valid identifier. This single rule is a frequent source of questions on C-heavy aptitude tests:

  • Q: Which of these is NOT a keyword in C? Options: int, FLOAT, break, void.
  • Answer: FLOAT is not a keyword. float (lowercase) is. FLOAT is a valid user-defined identifier.

If you see an option that looks like a keyword but has any uppercase letter, it is an identifier, not a keyword.

Identifiers: The Naming Rules You Cannot Skip

Identifiers are programmer-chosen names for variables, functions, arrays, and labels. The compiler accepts an identifier only if it satisfies four rules:

  • Must begin with a letter (a-z, A-Z) or _
  • After the first character, may contain only letters, digits (0-9), or _
  • Must not match any keyword exactly (case-sensitive comparison)
  • Must not contain whitespace or any other special character

In C89, a compiler is only required to treat the first 31 characters of an internal identifier as significant (and only the first 6 for names with external linkage). Two names that agree in every significant character but differ after that may be treated as the same identifier. C99 raises these minimums to 63 characters for internal identifiers and 31 for external ones.

Identifier | Valid? | Reason
count | Valid | Starts with letter, all alphanumeric
_total | Valid | Leading _ is allowed
student_2026 | Valid | _ and digits after the first character are fine
2fast | Invalid | Starts with a digit
my-var | Invalid | Hyphen is not a permitted character
float | Invalid | Matches the keyword float exactly
MAX_SIZE | Valid | Uppercase letters and _ allowed

Identifiers are case-sensitive. count, Count, and COUNT are three distinct identifiers. This is a common source of bugs at test time. For a systematic look at where naming and scoping mistakes surface in submitted code, the common C programming errors guide covers the patterns that appear most in campus assessments.

Constants and Strings

Constants

A constant holds a fixed value that cannot change during execution. C supports four constant types:

  • Integer constants: Whole-number values in decimal (42), octal (prefix 0, e.g., 052 equals 42), or hexadecimal (prefix 0x, e.g., 0x2A equals 42).
  • Floating-point constants: Written as 3.14 (decimal) or 3.14e2 (exponential notation for 314.0).
  • Character constants: A single character in single quotes ('A'). Stored as its ASCII integer value — 'A' equals 65, '0' equals 48.
  • Enumeration constants: Named integer values declared with enum, such as enum Color { RED, GREEN, BLUE };.

The const keyword and #define directive are the two mechanisms for enforcing constants in practice:

#define PI 3.14159
const int MAX = 100;

#define performs text substitution before compilation, so the macro itself has no type and occupies no storage. const creates a typed object that the compiler protects from modification. The distinction matters for type checking and for pointers: you can take the address of a const int and store it in a const int * (pointer to a constant int), whereas a macro such as PI has no address to point to.

Strings

A string in C is a sequence of characters stored in a char array, terminated by a null character ('\0'). String literals use double quotes:

char name[] = "FACE";

The array name holds five characters: 'F', 'A', 'C', 'E', '\0'. The null terminator is added automatically for string literals. Functions such as strlen() and printf() with %s stop processing when they encounter '\0'.

The distinction to know for tests:

  • 'A' is a character constant (type int, value 65)
  • "A" is a string literal (a char array: {'A', '\0'}, two bytes in memory)

These are not interchangeable. Initialising a char variable with a string literal (char c = "A";) instead of using a char[] is a common error that compilers diagnose.

Special Symbols

Special symbols are non-alphanumeric characters assigned fixed syntactic roles by the C language. Most act as punctuation rather than computing a result, but each one changes how the compiler parses what surrounds it. A few, such as * and =, also work as operators inside expressions.

Symbol | Role
[] | Array subscript: arr[i] accesses element at index i
() | Function call and grouping: printf(...), (a + b) * c
{} | Block delimiter: marks the start and end of a compound statement
, | Separator: separates function arguments and multiple declarations
; | Statement terminator: ends every executable statement
* | Pointer declaration and dereference (context-dependent)
= | Assignment: copies the right-hand value into the left-hand variable
# | Preprocessor directive marker: #include, #define, #ifdef

The * symbol deserves a note: in a declaration it marks a pointer type (int *p), and in an expression it dereferences a pointer (*p = 10). Same character, two distinct roles depending on syntactic context. This ambiguity is a classic multiple-choice setup. The pointers and arrays in C guide covers the full set of pointer contexts in which * appears.

Operators

An operator is a symbol that triggers a computation on one or more values. Those values are called operands. C classifies operators by how many operands they require.

Unary Operators (one operand)

Operator | Name | Example
++ | Increment | i++ (post-increment), ++i (pre-increment)
-- | Decrement | i-- (post-decrement), --i (pre-decrement)
- | Unary minus | -x negates x
! | Logical NOT | !flag evaluates to 1 if flag is 0
~ | Bitwise NOT | ~mask flips all bits
* | Dereference | *ptr reads the value at the address in ptr
& | Address-of | &var returns the memory address of var
sizeof | Size in bytes | sizeof(int) returns 4 on most 32-bit systems

Binary Operators (two operands)

Binary operators take two operands. They divide into sub-categories:

  • Arithmetic: +, -, *, /, % (modulo remainder)
  • Relational: ==, !=, <, >, <=, >=. Return 0 (false) or 1 (true).
  • Logical: && (AND), || (OR). Short-circuit: the right operand is not evaluated if the left determines the result.
  • Bitwise: &, |, ^ (XOR), << (left shift), >> (right shift). Operate on individual bits.
  • Assignment: = (simple), plus compound forms +=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=.

Ternary Operator (three operands)

C has exactly one ternary operator: the conditional ?:.

int max = (a > b) ? a : b;

The condition (a > b) is evaluated. If true, the expression returns a; if false, it returns b. The ternary operator produces a value, which is what distinguishes it from an if-else statement.

Operator Precedence Traps

Precedence determines which operations evaluate first when several operators appear together. Three traps appear in placement tests with consistent frequency:

  • *p++ increments the pointer (moves to the next memory address), not the value at p. To increment the value, write (*p)++.
  • Bitwise & has lower precedence than ==. The expression x & mask == 0 parses as x & (mask == 0), almost never what the writer intended. Use parentheses: (x & mask) == 0.
  • In C, && has higher precedence than ||. Unlike some other languages where the two are equal, a || b && c in C parses as a || (b && c).

For worked examples that use these operator rules in placement test format, the C programming interview questions list includes operator-precedence problems with full derivations.

Tokens and the Compiler Pipeline

Understanding token types converts MCQ elimination from guesswork into a rule check. Consider these three lines:

int register = 5;   /* register is a keyword: compile error */
int 2count = 0;     /* starts with digit: compile error */
int max = 'A' + 1;  /* valid: 'A' is 65, max becomes 66 */

In each case, classify every word and symbol by token type. If any token violates the rules for its type, the line fails before the program runs. Scanning options this way rules out one or two MCQ distractors in under ten seconds.

Knowing token categories also clarifies why certain C behaviours feel surprising. When output differs from what the code seems to say, the reason is often operator precedence or a constant type mismatch, both of which reduce to token classification at their root.

Where C Tokens and AI Tokens Meet

The token concept did not stay inside compilers. Large language models also process text as tokens, though the definition shifts: an LLM token is a sub-word chunk, and the tokenizer maps raw text to vocabulary IDs much as a C lexer maps source characters to typed tokens. Most LLM APIs price calls by token count for this reason, and inference cost scales with sequence length in tokens, not characters.

TinkerLLM lets you run live LLM API calls and inspect the tokenizer output directly. At ₹299, it puts actual token-count data in your hands: type a C code snippet into a prompt and watch how the tokenizer splits keywords, identifiers, and operators into its own vocabulary units. It is a concrete way to see that the lexical analysis idea C compilers have relied on since the 1970s has a close cousin in how the most powerful AI models read text today.

Frequently asked questions

How many keywords are in ANSI C?

ANSI C (C89) defines 32 reserved keywords. C99 added 5 more: restrict, inline, _Bool, _Complex, and _Imaginary. Most placement test MCQs reference the 32 ANSI C keywords, so that count is the one to memorise.

Can a C identifier start with an underscore?

Yes, identifiers may start with an underscore. However, identifiers beginning with an underscore followed by an uppercase letter or another underscore are reserved for the implementation (the compiler and the standard library). Avoid that pattern in your own code to prevent conflicts.

What is the difference between a constant and a variable in C?

A variable holds a value that can change during execution. A constant defined with const or #define holds a fixed value; the compiler prevents any code from modifying it after definition.

What is the difference between a string literal and a character constant in C?

A character constant is a single character in single quotes, like 'A', stored as its ASCII integer value. A string literal is a sequence of characters in double quotes, like "FACE", stored as a null-terminated char array with one extra byte for the '\0' terminator.

What are the types of C operators by arity?

Unary operators act on one operand (++, --, !, ~, sizeof). Binary operators act on two operands and include arithmetic, relational, logical, bitwise, and assignment sub-types. The ternary operator ?: acts on three operands and is the only one of its kind in C.

Is 'int' a keyword or an identifier in C?

int is a keyword, one of the 32 reserved words in ANSI C. It cannot be used as a variable name, function name, or any other identifier. Note that INT and Int are valid identifiers because C keywords are all lowercase.
