C++: Visibility and Look-up
This article deals with some situations where entities eclipse other entities with the same name. It starts with a review of some standard C++ mechanisms then presents some problems that arise when combining them. It's not for beginners - unless you're serious about C++ you don't need to worry about meeting these problems or trying to understand them.
Scope and Visibility
Consider this little program.
int i;
int main() {
  int i;
  i=9;
}
There are two variables called i - one inside main and one outside. Which i is set to 9? By default the "nearest" one is, the one inside main (we'll come back to what is meant by "nearest" later in this article). The other i isn't visible (it's hidden by main's i) but it's still "in scope", so it can be accessed. C++'s scope resolution operator (::) is used to specify where to look for a variable in such circumstances. With nothing written before the ::, the search is made in the global scope, which in this case is the scope that encloses main. Using ::i accesses the variable there.
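As a minimal sketch of the scope resolution operator at work, the following writes to both variables from inside one function. The names set_both and demo are made up for this illustration and aren't part of the program above.

```cpp
#include <cassert>

int i = 0;              // the outer i, at global scope

// Hypothetical helper: writes to both variables, returns the outer one.
int set_both() {
    int i;              // the inner i hides the outer one from here on
    i = 9;              // unqualified name: the inner i
    ::i = 4;            // scope resolution operator: the outer i
    assert(i == 9);     // the inner i was not touched by the write to ::i
    return ::i;
}

// Hypothetical driver, callable from main.
void demo() {
    assert(set_both() == 4);
    assert(::i == 4);
}
```

Calling demo() from main exercises both assignments; the asserts record which variable each one reached.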
Namespaces
A namespace defines a scope. It's like a context which determines the meaning of a symbol. Just as the meaning of "Cambridge" will change depending on whether you're in a UK or US context, so the variable accessed in a C++ program by the symbol "i" will depend on the context, as we've seen above.
Some simple languages only have one namespace - all function names, variable names, etc belong to the same context. Some other languages have several independent namespaces (one for variable names, one for function names, etc) making it possible to have both a variable and function with the same name, but the number and role of these namespaces are fixed.
C++ has some fixed namespaces, but it also has named namespaces and lets users create new namespaces. It also offers control over which of these named namespaces will be used when the meaning of a symbol is required.
In ANSI C++ the standard library facilities (like cout, string, etc) are kept inside the namespace called std, which by default isn't consulted. Namespaces are created, and entities put into them, by using the namespace keyword.
E.g.
namespace test {
  int i;
}
creates a namespace called test (if one hasn't already been created) and puts i into it. Then test::i (the same notation that you'd use were test a class with a static member called i) will access the i variable. The directive using namespace test will make available all the things inside test so that test:: isn't necessary. Let's see this in action:
namespace test {
  int i;
}
int i;
int main() {
  i=9;
}
In this program main can only see one i, the other is hidden inside the test namespace. The latter is in scope and can be accessed using test::i. What about the following though?
namespace test {
  int i;
}
using namespace test; //this line's been added
int i;
int main() {
  i=9; // error: ambiguous
}
Here both variables called i are visible from main. In fact they clash, so this program won't compile. My compiler says: The declarations "int i" and "int test::i" are both visible and neither is preferred under the name lookup rules.
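The clash only arises when the unqualified name is used. The sketch below shows two ways round it that I'm confident are standard C++: qualifying each use, and a block-scope using-declaration. The function names are hypothetical, chosen for this illustration.

```cpp
#include <cassert>

namespace test {
    int i = 0;
}
using namespace test;   // makes test::i visible alongside the global i

int i = 0;

// Hypothetical helper: qualifies every use, so nothing is ambiguous.
void set_each() {
    // i = 9;           // would not compile: ambiguous, as described above
    ::i = 9;            // qualified: the i declared in the global namespace
    test::i = 5;        // qualified: the i declared in namespace test
}

// Hypothetical helper: a using-declaration at block scope.
void set_via_using_declaration() {
    using test::i;      // inside this block, plain i now means test::i
    i = 7;
}

void demo() {
    set_each();
    assert(::i == 9 && test::i == 5);
    set_via_using_declaration();
    assert(::i == 9 && test::i == 7);
}
```

The using-declaration works because it puts the name into the function's own block, which is searched before the global scope where the two declarations collide.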
Classes
Functions can be in a class or free standing. Whenever a function call is processed there may be several available functions in scope with the same name. C++ performs a look-up using a well-defined strategy in order to decide which function to call. Sometimes it can't decide which function is best to call, in which case the compiler complains about an "ambiguous call". Usually there are no problems - the programmer and compiler agree on the best option. In the following for example, there's a free-standing fun(float f) as well as one in the class. The fun(f) call in main calls the free-standing one. The call from inside the class calls the "nearer" one inside the class.
void fun(float f) {}
class classy {
public:
  void fun(float f) {}
  void fun2(float f) { fun(f); }
};
int main() {
  float f = 0;
  fun(f);
  classy c;
  c.fun2(f);
}
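To make it observable which function each call reaches, here is a sketch of the same arrangement where the functions return a label instead of being empty. The return values and the member names nearest and outermost are assumptions added for illustration. It also shows that ::fun reaches the free-standing function from inside the class.

```cpp
#include <cassert>
#include <string>

std::string fun(float) { return "free"; }

class classy {
public:
    std::string fun(float) { return "member"; }
    std::string nearest(float f)   { return fun(f); }    // class scope is searched first
    std::string outermost(float f) { return ::fun(f); }  // qualified: skips the class
};

void demo() {
    classy c;
    assert(fun(1.0f) == "free");           // outside the class only the free function is found
    assert(c.nearest(1.0f) == "member");
    assert(c.outermost(1.0f) == "free");
}
```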
Functions and Overloading
Functions add a complication because it's possible to have many functions with the same name all visible without clashing as long as they take different arguments. In the following example fun is overloaded - which isn't a problem!
void fun(int i) {}
void fun(int i, int j) {}
int main () {
  fun(3);
  fun(5,7);
}
The following's perhaps a little trickier.
void fun(float f) {}
int main () {
  int i=3;
  fun(i);
}
There's no function called fun that takes an integer, so the int is implicitly converted to a float and fun(float f) is called without complaint. In the next example fun(int) is supplied, so this will be the preferred candidate because it matches exactly.
void fun(float f) {}
void fun(int f) {} // added line
int main () {
  int i=3;
  fun(i);
}
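The ranking behind these choices can be sketched as follows: an exact match beats a promotion, a promotion beats a conversion, and two equally ranked conversions give the "ambiguous call" complaint mentioned earlier. The string return values are an assumption added so the chosen overload can be checked.

```cpp
#include <cassert>
#include <string>

std::string fun(float) { return "float"; }
std::string fun(int)   { return "int"; }

void demo() {
    int i = 3;
    assert(fun(i) == "int");       // exact match
    assert(fun(2.5f) == "float");  // exact match
    assert(fun('a') == "int");     // char to int is a promotion, char to float a conversion
    // fun(2.5);                   // error: double to float and double to int are both
    //                             // conversions, so neither overload is preferred
}
```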
None of that should be too disturbing, but what about the following? Which f function is called? The first might look like the closest match, but it's the second that's called. The conversion from const char* to bool is a standard conversion built into C/C++, whereas the conversion from const char* to string goes through a constructor and so counts as user-defined, and matches using standard conversions take precedence over user-defined ones.
#include <iostream>
#include <string>
using namespace std;
void f(string a, string b, bool c = false) {
  cout << "called 3 arg function" << endl;
}
void f(string a, bool c = false) {
  cout << "called 2 arg function" << endl;
}
int main()
{
  f("one", "two");
}
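One way to get the three-argument function, sketched here under the assumption that the overloads can't be changed, is to hand over a real string so that no conversion to bool is possible. The integer return values replace the printing purely so the result can be checked.

```cpp
#include <cassert>
#include <string>

int f(std::string, std::string, bool = false) { return 3; }
int f(std::string, bool = false)              { return 2; }

void demo() {
    assert(f("one", "two") == 2);               // "two" decays to a pointer, then converts to bool
    assert(f("one", std::string("two")) == 3);  // a string can't convert to bool, so only
                                                // the 3 arg function is viable
}
```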
Inheritance
Often you'll need to add extra functionality to an existing class. C++ provides a mechanism to build new classes from old ones:
class Base {
public:
  int value1;
};
class More : public Base {
public:
  int value2;
};
int main() {
  Base b;
  b.value1=7;
  More m;
  m.value1=7;
  m.value2=9;
}
Here More inherits the members of Base so m has 2 members - value1 and value2. Members can be functions or variables. The following, which uses functions where the previous example used variables, works ok.
class Base {
public:
  void fun1() {}
};
class More : public Base {
public:
  void fun2() {}
};
int main() {
  Base b;
  b.fun1();
  More m;
  m.fun1();
  m.fun2();
}
Now we come to our first "interesting program". Suppose we give both functions the same name but different arguments. What happens?
class Base {
public:
  void fun1() {}
};
class More : public Base {
public:
  void fun1(int i) {}
};
int main() {
  Base b;
  b.fun1();
  More m;
  m.fun1(); // error: Base's fun1() is hidden
  m.fun1(5);
}
b.fun1() poses no problem. One might expect m.fun1() to call the Base's function and m.fun1(5) to call More's function (i.e. expect fun1 to be overloaded). In fact the code doesn't compile - void fun1() is hidden by void fun1(int), because the look-up stops at More's scope as soon as it finds a fun1 there and never reaches Base. With an extra line, a using-declaration, it will compile:
class Base {
public:
  void fun1() {}
};
class More : public Base {
public:
  using Base::fun1; // added line
  void fun1(int i) {}
};
int main() {
  Base b;
  b.fun1();
  More m;
  m.fun1();
  m.fun1(5);
}
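If More can't be edited, the hidden function can still be reached from the calling side by qualifying the member name. This is a sketch of that alternative; the return values are an assumption added so the calls can be told apart.

```cpp
#include <cassert>

class Base {
public:
    int fun1() { return 0; }
};

class More : public Base {
public:
    int fun1(int i) { return i; }   // hides Base::fun1 within More's scope
};

void demo() {
    More m;
    // m.fun1();                    // error: look-up stops at More::fun1(int)
    assert(m.Base::fun1() == 0);    // qualified name starts the look-up in Base
    assert(m.fun1(5) == 5);
}
```

The using-declaration is usually the tidier fix because callers then don't need to know which class supplies which overload.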
Function Lookup
And here's another surprising situation. The following compiles, but why?
namespace test {
  class T {};
  void f(T) {}
}
test::T parm;
int main() {
  f(parm); // OK: calls test::f
}
Here we have a namespace called test inside which there's a class T and a function that takes one argument of type T. Outside the namespace a variable called parm is created. Note that the test:: is needed to get hold of the T within this namespace. In main a function f is called. Even though there's no test:: before the function name, and no previous using namespace test line, the program compiles.
This is a situation where Koenig lookup (also called Argument-Dependent name Lookup - ADL) is used. If you supply a function argument that isn't a built-in type (here parm, of type test::T), then to find the function name the compiler is required to look in the namespace (in this case test) that contains the argument's type as well as in the usual places.
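The difference between a class-type argument and a built-in one can be sketched like this. The function g and the return values are hypothetical additions to the earlier example.

```cpp
#include <cassert>

namespace test {
    struct T {};
    int f(T)   { return 1; }
    int g(int) { return 2; }   // hypothetical extra function taking a built-in type
}

void demo() {
    test::T parm;
    assert(f(parm) == 1);      // ADL: parm's type lives in test, so test is searched
    assert(test::g(7) == 2);   // a qualified call always works
    // g(7);                   // error: int has no associated namespace,
    //                         // so test is never searched for g
}
```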
Ordinary name look-up for an unqualified name starts in the nearest enclosing scope where the name is used and, if the name isn't found there, proceeds through successively enclosing scopes until it is found. Once a scope containing the name has been found, the search proceeds no further through the hierarchies, even if what was found is not appropriate for the given use. For an unqualified function call, ADL then extends the search to the namespaces associated with the arguments' types.
This explains why the example in the previous section failed whereas the one in this section succeeded, but the look-up mechanism seems to be defeating one of the purposes of namespaces - the ability to hide entities. However, there's a case for saying that once T is brought out into the open, then associated routines should become visible too. There are also pragmatic and safety reasons why ADL is used.
- Here's a simple program
#include <iostream>
#include <string>
int main() {
  std::string hello = "Hello, world";
  std::cout << hello;
}
This is analogous to the previous program: std::string is like the test::T of the earlier example. The operator<< that prints a string is a free function in std that the compiler can only find using ADL (it can't be a member function of string because it requires a stream as the left-hand argument). Without ADL the final line would be awkward to express.
- Here's another simple fragment
char x;
void f() {
  int x;
  x = 'a';
}
C/C++ has always set the function's x in this situation although the other x is a closer match type-wise. ADL conforms with this traditional behaviour.
Here's a situation involving classes. In this fragment, the g function calls the class's f routine.
class X {
  int f(int);
  int g() { return f('a'); }
};
But suppose that during program development a global function f(char) were added - what f function should g call then? It would be an unpleasant shock if the global function were called - you don't want the internals of classes to be quite so vulnerable to external changes.
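A sketch of that scenario, with the global f(char) already added, shows the class staying insulated. The string return values are an assumption made so the chosen function can be checked.

```cpp
#include <cassert>
#include <string>

std::string f(char) { return "global f(char)"; }   // the later addition

class X {
    std::string f(int) { return "X::f(int)"; }
public:
    std::string g() { return f('a'); }   // class scope is searched first, and the
                                         // look-up stops there
};

void demo() {
    X x;
    assert(x.g() == "X::f(int)");
    assert(f('a') == "global f(char)");  // outside the class the global one is found
}
```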
A final note from Victor Bazarov on comp.lang.c++ - ADL applies only to function names, not variables. The only other thing that has arguments in C++ is templates. But ADL doesn't apply to them. In this example
namespace test {
  enum foo { f };
  template<foo f> class bar {};
}
int main() {
  bar<test::f> barf; // error: bar isn't found
}
main's bar isn't going to be looked up in test even though its argument is fully qualified and found in the test namespace.
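For completeness, here is a sketch of two ways to name the template so that the fragment compiles. The template parameter is renamed V here, an assumption made only to keep it distinct from the enumerator f.

```cpp
#include <cassert>
#include <type_traits>

namespace test {
    enum foo { f };
    template<foo V> class bar {};
}

void demo() {
    // bar<test::f> a;           // error: bar is not searched for in test
    test::bar<test::f> b;        // qualifying the template name works
    using test::bar;             // so does a using-declaration
    bar<test::f> c;
    assert((std::is_same<decltype(b), decltype(c)>::value));  // both name the same type
    (void)b; (void)c;            // silence unused-variable warnings
}
```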