November 10, 2023

Let's learn USB! -- A C++ primer

Let's learn C++!

No, not really -- but let's learn at least enough to read and understand the papoon source code. I find that difficult to do, even with a good understanding of C. I suspect that the regbits scheme the papoon author invented is at least part of the issue. It is essentially a unique, one of a kind "dialect" or domain specific language for C++ hardware access that we find ourselves forced to deal with.

There is absolutely no required connection between C++ and USB. So, if you just want to skip this page, you should feel free to do so. However, we are trying to learn some USB tricks by studying papoon. And papoon is some 7300 lines of C++ code, so for an old dyed in the wool C programmer like myself, learning some C++ seems unavoidable.

Bear in mind as you read this that I am writing as a person who is very experienced with C but entirely new to C++. I may well misunderstand things that are new to me and fail to properly explain them. If you discover such shortcomings, send me an email and I will fix them.

Books

I have two C++ books. One sucks, the other seems pretty good:
C++ for C Programmers by Ira Pohl, 1994 (sucks)
Essential C++ by Stanley Lippman, 2000 (good!)
The title of the first book makes it sound perfect. But the author is out to "sell" C++ as a better language than C and makes it his crusade throughout the book. 370 pages of this. You would think the book would be short since he doesn't need to spend time explaining C language basics, but it doesn't work out that way.

Amazingly enough the second book (by Lippman) is only 276 pages and much better written. He doesn't try to make C++ the new religion that will save the world. If you want more of him, he also has a 1237 page "C++ Primer" that comes with recommendations.

I need to say this. I have spent the better part of two days working this up, so don't expect to just read through it and absorb it in a few minutes. The human mind can only learn so much new stuff in a given time and needs some rest between intervals of intense learning. I have tried to make this as compact and information dense as possible.

Also, you will gain much much more if you try to write and run some C++ code yourself. It is easy! It is fun! And every error message will teach you something. You may think you understand something, but the compiler will show you that you don't.

File names

So, is a C++ file x.cpp or x.cxx? These days either one is fine and the compiler will recognize either convention. What about .hxx files (or .hpp files, which you will see in some places)? These are fine. In fact .h files are just fine. For that matter you can include spam.xyz and the compiler would (or should) be happy to do it. It makes some people happy to name their header files .hxx, and if so, then fine for them.

In fact papoon ships with .hxx header files, which I soon intend to rename to .h headers as a sort of experiment. For one thing, it will make it easier to look at a directory listing and see at a glance which files are .cxx and which are .h. Indeed, it works just fine.

Reading skills

We don't particularly want to become C++ programmers, but we do want to be able to read and understand the papoon code. We can get a fair ways just squinting, ignoring details and getting a general idea of what is going on, but at some point details become important. I'll also note that the papoon author invents a sort of C++ dialect we need to learn with his "regbits" scheme. We may need to devote a page of its own to that -- we shall see.

What I am going to try to do is to make an outline of the Lippman book, going through it in order, skipping everything I possibly can.

IO and printf

To understand papoon we can skip C++ file IO entirely. As a module intended to be part of an embedded program, papoon has no dependencies on the standard C++ libraries, which is very nice. I use printf to generate console messages by using this prototype:
extern "C" void printf ( const char *, ... );
That does the trick and allows me to call the printf I have implemented in C code from C++.

Running C++

I want to at least experiment with short C++ programs as part of my education. I can invoke either g++ or c++ as follows:
g++ -o fish fish.cxx
The usual "hello world" program will look like this:
#include <iostream>
using namespace std;

int
main ()
{
    cout << "Hello World";
    cout << endl;
}
Note however that we are using the cout facility from the system library "iostream", which is something we said we were going to avoid.

Also note that C++ here has somehow taken over the C left shift operator for another purpose. This is getting way ahead of ourselves, but the proper lingo is "overloaded" rather than "overridden". How this gets done involves operator functions (and perhaps friend declarations) and C++ trickery that is way beyond me now, so just ignore all this for the time being. Whoever wrote the iostream classes decided to do things this way and knew all the tricks.

I find that I can accomplish the same thing like so:

#include <cstdio>

int
main ()
{
    printf ( "Hello World\n" );
}
The C++ purist would probably advocate for the use of the iostream method.

Note also that these include statements don't have a ".h" at the end of the file names. There is no magic going on. C++ includes exactly the file you specify. The surprise is that the C++ standard header files don't have any ".h" extension! In fact "string" and "string.h" are two very different things.

So we can learn a lot just playing with the most basic and simple program. The C++ world is very different, from the bottom up.

Strings and other types

C++ offers a "string" class as part of the standard library. This is a different approach from the one taken by C, although you can still define C style strings, which are arrays of char (8 bit gadgets). Papoon doesn't use the string class. It does include stdint.h, which is provided by the compiler. It gives us things like uint32_t, uint8_t and so forth, presumably via typedefs that promote code portability.

C++ adds a "bool" type with the values true and false to the set of principal types.

Arrays and Vectors

C++ has arrays, just like C -- nothing new there that I can see. But C++ adds a vector class that some folks recommend in lieu of arrays. The benefit is that vector is a template, so the same class works for any element type, and a vector keeps track of its own size. I have a whole page dedicated to templates at the end of this C++ primer. As a preview of how this looks consider:
int joe[22];
vector<int> joe ( 22 );
For an array of integers, these are more or less equivalent, but when we start wanting to work with vectors of objects, the class name goes inside the angle brackets and we get a vector of objects. More later.

Random good things

A number of things were first introduced in C++ and are now enjoyed as part of the regular C language. Things like "//" for single line comments, the "const" prefix, the "long double" type and so on.

Some people claim that "const" can be used in cases where a #define would have been used in C. I am not convinced, mostly based on the sort of code that will be generated if we interchange these:

#define LIMIT 999
const int limit = 999;
If you don't care about the generated code and want to live in a high level language land of fantasy, perhaps the "const int" construct might be chosen as a preferred style. But I argue that it is a matter of style and preference. It seems to be a style preference among the C++ community to reduce the use of preprocessor macros.

You can initialize an integer variable in either of these ways:

int x = 0;
int x ( 0 );
The second form is new for C++ and while it may seem like an odd option here, it becomes useful when an instance of a class is being initialized, particularly when the class has several members.

Consider the following. This is good, recommended C++ coding:

const string my_msg ( "Execution halted, terrible error" );

Function call by reference

A prototype and function definition can be written like this:
void pickle ( int &x, int &y );
Without the ampersand, we get the C behavior we are used to. Namely copies of the value of x and y get pushed onto the stack and the function gets called. Inside the function those values are local and can be changed without affecting the outside world.
With the ampersand, the values are passed by reference, so changes inside the function do affect the outside world. The function call and the body of the function look exactly the same in both cases.

This sort of thing is not restricted to function arguments. We can do this:

int value;
int &rval = value;
Now value is the actual thing with a value and rval is a reference to the same thing, not another variable. This becomes more interesting, useful, and important when we begin dealing with objects.

Function call with default parameter values

You can do this:
int func ( int arg = 123 ) { return arg; }
Parameters with defaults must be at the end of the argument list so they can be omitted. The default value can be specified either in the prototype (in a header file) or in the function definition, but not in both.

Function overloading

C++ allows us to have several functions, all with the same name, that are distinguished by the types and number of the arguments. So we could have:
fish ( int )
fish ( int, int )
fish ( double )
And C++ would be perfectly happy and pick the right one based on what arguments the function was called with.

Template functions

We haven't talked about templates yet (at least not in detail), but we aren't going to let that stop us from talking about template functions. The idea is that if we are passing a templated type (such as a vector) to a function, we can generalize the function to work with any type that might be used in the template.
template <typename etype>
void tfunc ( const vector<etype> &vec )
{
    etype t = vec[99];
    ....
}
This could be called by:
vector<int> ivec;
tfunc ( ivec );
vector<string> svec;
tfunc ( svec );

new and delete

What you might do with calls to malloc() and free() in a C program is built into C++ as the new and delete operators. You could do this:
int *ip;
ip = new int ( 999 );
delete ip;
If you wanted an array of 333 integers you would do:
int *ip = new int[333];

Namespaces

When you write code for a library, you generally put it into a namespace. For example papoon uses several namespaces including one named "regbits". Once you have done this, you have a choice. You can reference things in the namespace explicitly using something like std::cout. If typing all this wears you out, you can specify that you are using a namespace by the statement:
using namespace std;
Then you can just type "cout" rather than the laborious "std::cout".

More about "using"

You will find some fancy use of "using" in the papoon files, so we are sort of forced to dig deep into it. Since we have just mentioned it in conjunction with namespaces, we may as well elaborate the whole story. We can use "using" in several ways: Clearly we are getting well ahead of ourselves once again. Don't blame me. Papoon does a bunch of exotic stuff with "using" that still doesn't make sense, even with the above explanations.
Feedback? Questions? Drop me a line!

Tom's Computer Info / tom@mmto.org