"List" Is a Four-Letter Word

By: Jeff "japhy" Pinyan

Email comments to japhy@pobox.com
Much confusion can arise from the concepts of lists and arrays in Perl. Add the concepts of scalar context and list context, and one can become thoroughly befuddled. Now you can learn how to properly wield these words, and use lists effectively.

A Word on Context

In Perl, the word "context" refers to the setting that an expression is in. There are three contexts: void, scalar, and list. Only the last two are relevant to the content of this article.
Rick Delaney suggested an analogy of "context" be made to another language, such as English. This analogy will come in handy in the text to follow. "Rules" -- alone, without context, that word is meaningless. However, "To understand Perl, you must understand its rules," and "Perl rules!" show the difference in context; in one, "rules" is a noun, and in the other, it is a verb.

Scalar

Scalar context is invoked when Perl is expecting to return a single value. The following situations invoke scalar context on the expression being assigned, tested, or used as an operand.
  $scalar = $otherscalar;
  $scalar = @array;
  $scalar = (EXPR1, EXPR2, EXPR3);
  $scalar = subroutine();
  $scalar = <FH>;
  $scalar = scalar(@array);
  my $scalar = $otherscalar;
  my $scalar = @array;
  my $scalar = (EXPR1, EXPR2, EXPR3);
  my $scalar = subroutine();
  my $scalar = <FH>;
  my $scalar = scalar(@array);
  if ($scalar) { ... }                  # or elsif or unless
  if (@array) { ... }
  if (EXPR1, EXPR2, EXPR3) { ... }
  if (subroutine()) { ... }
  while ($scalar) { ... }               # or until
  while (@array) { ... }
  while (EXPR1, EXPR2, EXPR3) { ... }
  while (subroutine()) { ... }
  $scalar = $string =~ /regex/;
  my $scalar = $string =~ /regex/;
  $array[0] = @otherarray;
  $array[0] = <FH>;
  $hash{key} = $scalar;
  $scalar + @array;
Note: EXPRn refers to any Perl expression

Whenever a value is tested for falsehood or truth, it is evaluated in scalar context.

Arrays return their length -- the number of elements they hold -- when called in scalar context. Comma-separated series of expressions, such as
  ('a', 'b', 'c')    # or $x++, abs $foo, @a;
in scalar context are discussed below, in the section discussing the comma operator. In Perls before 5.26, hashes return a string that represents the number of "buckets" that are being used, out of how many are allocated; from 5.26 onward they return the number of keys. Functions called in scalar context apply the scalar context to their return values.
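The array behaviour described above can be checked with a short sketch. The variable names here are illustrative only and are not taken from the article.

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');

# Assigning an array to a scalar evaluates the array in scalar context,
# which yields its element count.
my $count = @array;

# scalar() imposes scalar context explicitly.
my $forced = scalar(@array);

# Arithmetic operators impose scalar context on their operands.
my $sum = 10 + @array;

# A truth test is a scalar context too: an empty array is false.
my @empty    = ();
my $is_empty = @empty ? 0 : 1;

print "count=$count forced=$forced sum=$sum is_empty=$is_empty\n";
```

All four results come from the same rule: the array is never flattened, it is simply asked how many elements it holds.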

List

List context is invoked when Perl is expecting to return any number of values, whether 0, 1, or more. The following situations invoke list context on the expression being assigned, iterated over, or passed as arguments.
  ($scalar) = $otherscalar;
  ($scalar) = @array;
  ($scalar) = (EXPR1, EXPR2, EXPR3);
  ($scalar) = subroutine();
  ($scalar) = <FH>;
  my ($scalar) = $otherscalar;
  my ($scalar) = @array;
  my ($scalar) = (EXPR1, EXPR2, EXPR3);
  my ($scalar) = subroutine();
  my ($scalar) = <FH>;
  for ($scalar) { ... }
  for (@array) { ... }
  for (EXPR1, EXPR2, EXPR3) { ... }
  for (subroutine()) { ... }
  print $scalar;
  print @array;
  print EXPR1, EXPR2, EXPR3;
  print subroutine();
  ($scalar) = $string =~ /regex/;
  my ($scalar) = $string =~ /regex/;
  @array = @otherarray;
  @array = $scalar;
  @array[0] = @otherarray;
  @array = <FH>;
  @array[0] = <FH>;
  @hash{key1,key2} = ($scalar1,$scalar2);
  %hash = qw( key value key2 value2 );
  push @array, function();
If you are confused by the difference between $array[0] and @array[0], don't worry. You will soon learn the definition of the term slice, and you will learn that @array[0] is syntactically equal to ($array[0]). As you can see, parentheses around a scalar or group of scalars on the left-hand side of the assignment operator invoke list context.

Arrays in list context return their elements in order. Comma-separated series of expressions in list context return their values in order as well. Hashes in list context return a seemingly unordered list of their key-value pairs. Functions in list context will invoke list context on their return values.
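As a rough illustration of these rules, the sketch below puts the same array and the same subroutine into both contexts. The subroutine name letters() and the variable names are hypothetical.

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');

# Parentheses on the left make this a list assignment:
# the first element is kept and the rest are discarded.
my ($first) = @array;

# Without the parentheses, the array is evaluated in scalar context.
my $count = @array;

# A subroutine's return expression is evaluated in the caller's context.
sub letters { return @array }
my ($first_letter) = letters();   # list context
my $letter_count   = letters();   # scalar context

# A hash in list context flattens to key/value pairs in no promised order.
my %hash = (one => 1, two => 2);
my @flat = %hash;

print "first=$first count=$count\n";
print "first_letter=$first_letter letter_count=$letter_count\n";
print "flattened hash has ", scalar(@flat), " elements\n";
```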

The Comma Operator

Note: there is no such thing as a list in scalar context. When this appears to be the case, the comma operator is put to work.

When Perl sees a comma-separated series of expressions inside parentheses, in scalar context, Perl employs the comma operator. This interesting operator evaluates its left-hand operand, discards the result, and returns its right-hand operand. Thus, only the final expression is returned for use.
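A small sketch of the comma operator at work, using operands with side effects so that the discarded evaluations stay visible. The variable names are illustrative.

```perl
use strict;
use warnings;

my ($x, $y, $z) = (1, 10, 100);

# Scalar assignment: the comma operator evaluates every operand from
# left to right, discards all but the last, and returns the last one.
my $last = (++$x, ++$y, ++$z);

print "last=$last x=$x y=$y z=$z\n";   # last=101 x=2 y=11 z=101
```

All three variables are incremented even though only the value of the final expression reaches $last.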

Notice how parentheses play a very important role when dealing with lists and comma-separated expressions:
  $scalar = ('a', 'b', 'c');    # scalar; $scalar = 'c'
  ($scalar) = ('a', 'b', 'c');  # list;   $scalar = 'a'
  $scalar = 'a', 'b', 'c';      # scalar; $scalar = 'a'
  @array = ('a', 'b', 'c');     # list;   @array = ('a', 'b', 'c')
  @array = 'a', 'b', 'c';       # list;   @array = ('a')
In the first case, we can tell that scalar context is being applied. We know how the comma operator works -- thus, the value 'c' is returned, and $scalar is set equal to it. Look below, regarding case five, for a warning message you would get if you use the -w switch to perl.

In the second case, there is list context, and each value on the right is assigned to the variable in the same position on the left-hand side:
  ($a,$b,$c) = ('alpha','beta','gamma');
sets $a to 'alpha', $b to 'beta', and $c to 'gamma'. In this case, though, only the first value of the list is saved to a variable, and the others are thrown out. It is important to know that, although they are "thrown out", they are still evaluated. The same holds for the comma operator in scalar context. Each of the three variables here gets incremented, although only the last one is saved to another variable:
  $incremented_z = (++$x, ++$y, ++$z);

In the third case, the assignment operator (=) binds more tightly than the comma operator, and you end up with a series of expressions, namely: $scalar = 'a', and 'b', and 'c'. See below, regarding case five, about a warning message you'll get if you use the -w switch to perl.

In the fourth case, the array on the left-hand side calls for list context, and so the array is cleared, and is given the values of the list as its elements. It is important to realize that:
  @array = ('a', 'b', 'c');
and
  ($array[0], $array[1], $array[2]) = ('a', 'b', 'c');
do not do the same thing. The first clears the array, and gives it three elements with the given values; the second changes the values of the first three elements of the array, and leaves the rest alone.
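The difference between the two assignments shows up as soon as the array already holds more than three elements. This is a minimal sketch with illustrative array names.

```perl
use strict;
use warnings;

my @whole   = (1 .. 5);
my @partial = (1 .. 5);

# Assigning to the whole array clears it first.
@whole = ('a', 'b', 'c');

# Assigning to a list of elements only touches those elements.
($partial[0], $partial[1], $partial[2]) = ('a', 'b', 'c');

print "@whole\n";     # a b c
print "@partial\n";   # a b c 4 5
```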

In the fifth case, we would get a warning from Perl if we were using the -w switch to perl. This would also happen in the first and third examples.
  @array = "MIDN", "4/C", "PINYAN";
would make Perl alert you with the messages:
  Useless use of a constant in void context at program.pl line 3.
  Useless use of a constant in void context at program.pl line 3.
This is because the assignment operator has greater precedence than the comma operator, even when you are assigning to an array. Here, @array would be cleared, and its first element would be given the value "MIDN", and the other two values would spawn the warning messages shown above.

Lists vs. Arrays

Now that you've been thoroughly bombarded with lists and arrays and contexts, it's time we nailed down the difference between an array and a list. It's all in the name -- literally. An array is a list with a name. Because arrays and lists differ in this way, Perl treats them differently (good thing). The major differences between the two are: As for their similarities:

Slices

Earlier, it was explained that @array[0] was a slice, and was not exactly the same as $array[0]. An array slice is indicated by a leading @ on the array name, followed by one or more expressions in brackets:
  @array[0,1,2] = ($a,$b,$c);
An array slice is a shorthand format for referring to a list made up of individual elements of the array:
  ($array[0], $array[1], $array[2]) = ($a,$b,$c);
Thus, the following lines do two very different things:
  $line[0] = <FILE>;
  @line[0] = <FILE>;
The <FH> operator returns a single line in scalar context, and a list of all the lines from its current position to the end of the file in list context. The second line can be rewritten as:
  ($line[0]) = <FILE>;
which calls the <FH> operator in list context, which means the entire contents of FILE are read, and the first value returned is stored in $line[0], while all others are discarded -- their values not used.
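To see the two reads side by side without needing a file on disk, the sketch below opens an in-memory filehandle on a string. The three-line sample text and the variable names are assumptions made for the demonstration, and the list-context read is written in the ($first[0]) form given above.

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird\n";

# Scalar context: exactly one line is read.
open my $fh, '<', \$text or die "open failed: $!";
my @line;
$line[0] = <$fh>;
my @rest = <$fh>;        # the two remaining lines are still available
close $fh;

# List context: every line is read, and only the first is kept.
open $fh, '<', \$text or die "open failed: $!";
my @first;
($first[0]) = <$fh>;
my @left = <$fh>;        # nothing remains to be read
close $fh;

print "scalar read left ", scalar(@rest), " lines\n";
print "list read left ",   scalar(@left), " lines\n";
```

Both reads store the same first line; the difference is how much of the file has been consumed afterwards.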

Slices work on lists as well as arrays. Instead of writing something as hideous as:
  sub year {
    my ($s,$m,$h,$D,$M,$Y,$wday,$yday,$isdst) = localtime;
    return $Y + 1900;
  }
one could simply use:
  sub year {
    my $Y = (localtime)[5];
    return $Y + 1900;
  }
The localtime() function returns a string when called in scalar context, but a list of values in list context. Only the element with index 5 is of any interest to us. To take a slice from the values returned by a function, the function call must be placed in parentheses, and then the subscript follows outside the parentheses, as is shown above. List slices of one element are no different in syntax than list slices of multiple elements because there is no leading symbol for a list:
  $scalar = (localtime)[5];            # gets year
  $scalar = (localtime)[0,3,5];        # still gets year
  ($day,$year) = (localtime)[3,5];     # $day gets day, $year gets year
The second line demonstrates what happens when you have a slice in scalar context. It can be explained with the following expansion:
  $foo = ('a', 'b', 'c')[0,2];
  $foo = ('a', 'c');
  $foo = 'c';
With arrays, the leading symbol changes depending on whether you are taking a slice or fetching an individual element:
  $array[0] = "foo";          # first element is set to "foo"
  $array[0,4] = "foo";        # fifth element is set to "foo"
  @array[0,4] = ("a","b");    # first is "a", fifth is "b"
  @array[0,4] = "a", "b";     # first is "a", fifth is undef!
Be careful to remember your parentheses, as shown by the last line of the example!
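Because localtime() changes from run to run, the slice rules are easier to verify against a fixed list. The sketch below uses a hypothetical letters() subroutine in its place.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a function that returns a list.
sub letters { return ('a', 'b', 'c', 'd', 'e') }

# One-element list slice.
my $one = (letters())[2];

# Multi-element list slice in scalar context: the last element selected.
my $scalar_slice = (letters())[0, 3];

# The same slice in list context: each selected element, in order.
my ($p, $q) = (letters())[0, 3];

# Array slice (leading @) versus single element (leading $).
my @array  = letters();
my @picked = @array[1 .. 3];
my $elem   = $array[4];

print "one=$one scalar_slice=$scalar_slice p=$p q=$q\n";
print "picked=@picked elem=$elem\n";
```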

Slices can be done on any list, not just one returned from a function:
  $day = ("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")[$num];
  @primes = (1, 3, 5, 7, 9)[1..3];
The .. operator here, shown in list context, is merely a range operator, and saves us from typing out all the numbers in a numerical range.

An understandable explanation of why the leading symbol changes is that "the symbol tells you what you're getting back, not what you're working with". A leading @ indicates a list being returned, while a leading $ indicates a scalar value. That is why hash slices look like
  @hash{key1, key2} = ("value1", "value2");
While we are discussing hashes, it is important to know that Perl has long emulated multi-dimensional hashes using the syntax:
  $hash{level1, level2, level3} = "value";
For backward compatibility, a comma-separated series of expressions inside a hash subscript for a scalar does NOT employ the comma operator -- instead, the expressions are converted to a scalar via join($;, LIST), where $; is the subscript separator variable, and LIST is the series of expressions (that becomes "keys"). Please note that $hash{@array} evaluates @array in scalar context; it does not expand it to its values. This is consistent with the rules of arrays.
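The three hash subscript forms can be compared in a short sketch. The hash names and keys are made up for the example, and it assumes the script does not disable this old multi-dimensional emulation (recent feature bundles such as "use v5.36" do).

```perl
use strict;
use warnings;

# Hash slice: leading @, a list of keys, a list of values.
my %color;
@color{'sky', 'grass'} = ('blue', 'green');

# Emulated multi-dimensional key: the subscript list is joined with $;.
my %grid;
$grid{'x', 'y'} = 'value';
my $sep     = $;;                           # the subscript separator
my $joined  = exists $grid{ join($sep, 'x', 'y') } ? 1 : 0;
my ($key)   = keys %grid;
my @parts   = split /\Q$sep\E/, $key;       # recovers ('x', 'y')

# An array in a scalar-element subscript is evaluated in scalar context,
# so the key is the element count, not the elements.
my @array = ('a', 'b', 'c');
my %by_count;
$by_count{@array} = 'three';
my ($count_key) = keys %by_count;

print "sky=$color{sky} grass=$color{grass}\n";
print "joined=$joined parts=@parts count_key=$count_key\n";
```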

Return Values

The context a subroutine is called in is applied to the values it is to return. Take these simple subroutines:
  sub A {
    return ('a', 'b', 'c');
  }
  sub B {
    my @array = ('a', 'b', 'c');
    return @array;
  }
  sub C {
    my @array = ('a', 'b', 'c');
    return @array[0..$#array];
  }
With these three functions, there is no way, just by looking at them, of being sure what they are going to return. A() might return three scalar values, or if it is called in scalar context, the comma operator will act on the series of comma-separated values, ('a', 'b', 'c'), and only 'c' will be returned.

The $#array variable refers to the last index of @array, which is documented in perldata. Let us now examine the return values of these functions:
  $a = A();
  ($b) = A();
  $c = B();
  ($d) = B();
  $e = C();
  ($f) = C();
The A() function, in scalar context, has the comma operator act on the comma-separated series of values, and so $a is set to 'c', the last value. $b, however, gets the value 'a', because it invokes list context on the function, which returns a list, and only the first element of that list is assigned.

The B() function returns an array. Thus, $c invokes scalar context on an array, and thus gets the value 3, the length of the array. We know, then, that $d gets the value 'a' because it invokes list context.

The final pair is not as difficult as it may seem to be -- the function returns an array slice. Thus, C() behaves exactly like A(): $e gets 'c', and $f gets 'a'.
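The three subroutines can be run in both contexts to confirm what each call yields. The result variable names below are illustrative; they replace $a through $f only to keep the sketch clean under "use strict".

```perl
use strict;
use warnings;

sub A { return ('a', 'b', 'c'); }
sub B { my @array = ('a', 'b', 'c'); return @array; }
sub C { my @array = ('a', 'b', 'c'); return @array[0 .. $#array]; }

my $a_scalar = A();      # comma operator: last value
my ($a_list) = A();      # list: first value
my $b_scalar = B();      # array in scalar context: element count
my ($b_list) = B();      # list: first value
my $c_scalar = C();      # slice in scalar context: last value
my ($c_list) = C();      # list: first value

print "A: $a_scalar $a_list\n";   # A: c a
print "B: $b_scalar $b_list\n";   # B: 3 a
print "C: $c_scalar $c_list\n";   # C: c a
```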

Anonymous Array

This next section assumes you have some knowledge of references in Perl, specifically array references.

When creating a reference to an anonymous array, the question is raised as to whether an array without a name is a list. Remember, an array is a named list, so shouldn't an unnamed array be a list?

"N-n-not exactly."

Just because an array is anonymous does not mean it is stripped of its properties:
  $aref = [ qw( an array here ) ];
  push @$aref, "not", "a", "list";
Clearly, @$aref is the array referenced by $aref, which gives it all it needs, in terms of a "name". Anonymous array references are as much arrays as regular arrays. Think of an anonymous array as having a name that only Perl can pronounce. :)
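A brief sketch showing that a dereferenced anonymous array obeys the same array rules covered earlier: it can grow, it reports its length in scalar context, and it can be sliced. The extra variable names are illustrative.

```perl
use strict;
use warnings;

my $aref = [ qw( an array here ) ];
push @$aref, "not", "a", "list";

# Scalar context gives the element count, exactly as for a named array.
my $count = @$aref;

# Slices and single-element access work through the reference as well.
my @pair = @{$aref}[0, 1];
my $last = $aref->[-1];

print "count=$count pair=@pair last=$last\n";   # count=6 pair=an array last=list
```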

Resources

Many functions in Perl act differently when called in list context from when they are called in scalar context. Read the perlfunc documentation on the specific function you are interested in. This goes for operators as well, like pattern matching. Read perlop. Some examples:
  ($day,$month,$year) = (localtime)[3..5];
  $date_string = localtime;
  $found = $string =~ /Jeff/;
  @integers = $string =~ /(\d+)/g;
If you really want to know about old multi-dimensional hashes, read the perlvar documentation for the $; variable, and its use.
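The examples above can be run as they stand once $string has a value. The sample string in this sketch is an assumption chosen so that both matches succeed.

```perl
use strict;
use warnings;

my $string = "Jeff has 3 cats and 12 fish";

# Scalar context: the match reports success or failure.
my $found = $string =~ /Jeff/;

# List context with /g: every captured value is returned.
my @integers = $string =~ /(\d+)/g;

# List context without /g: the captures of the first match only.
my ($first_int) = $string =~ /(\d+)/;

# localtime: a date string in scalar context, a list of fields in list context.
my $date_string = localtime;
my ($day, $month, $year) = (localtime)[3 .. 5];

print "found=$found integers=@integers first_int=$first_int\n";
print "date_string=$date_string year=", $year + 1900, "\n";
```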

The perldata documentation discusses arrays, lists, and contexts, and gives a formal introduction to arrays.