I am in the same boat as you are... bear with me, I am by no means an expert.
In fact, I read a couple of chapters of Modern C yesterday :). Here are some of
the things I am doing to bring my C skills closer to those of the
professional developers I admire.
Decide which platform to use
~~~~~~~~~~~~~~~~~~
Unfortunately, to become proficient in C we need to write code and focus on a
platform. I have been going back and forth between developing on Windows and
on Linux. I am very experienced in the Windows environment (debuggers, cl,
linkers, WinDbg etc.), but when it comes to writing good quality C code (not
C++) and learning how to write a maintainable, moderately large code base, my
research showed that the Windows compilers and C standard APIs are not great;
in fact they hinder your productivity. I have wasted countless hours just
figuring out how to properly create a simple C project with a decent build
system. Unfortunately, I could not find one. The closest I could find is
CMake, as MSBuild is a nightmare to work with. I even tried NMAKE but failed.
When it comes to documentation of basic C API usage,
MSDN(https://docs.microsoft.com/en-us/cpp/c-runtime-library/run-t...)
does a decent job. But in the name of security you will find a zillion
variations (_s, _l) of each API, and by default the MSVC compiler complains
when you use some of the APIs in their simple forms. Instead, you have to
define _CRT_SECURE_NO_WARNINGS etc. I think for someone just getting started
and learning to write a decent code base in C, these restrictions really
hinder productivity. So finally, I have decided to focus my learning on the
Linux platform (currently through WSL - Windows Subsystem for Linux) with its
POSIX APIs. You know what, `man 3 printf` or `man 3 strlen` is soooooo much
better than googling MSDN.
Mastering C
~~~~~~~
I think the simple and straight answer here is reading good code, writing
"good" code, and reading good C content (be it books or articles). I think
these are the three ingredients necessary to get started. Of all the open source
projects that I have investigated, I found that the Linux kernel and related
projects have very good taste in terms of code quality. Currently, I am
focused on how they use the language rather than on what the projects
actually do: how they structure the project, how they name things, how
they use types, how they create structures, how they pass structures to
functions, how they use lightweight object-based constructs, how they handle
errors in functions (for example forward-only goto exits), how they use
signed/unsigned variables (more of my learnings at the end), and how they use
their own data structures. I think it's good to initially target the ANSI C
API with C99 instead of relying heavily on the OS-specific API of whichever
platform you choose. Good practice projects for that are binary file
parsers, for example a parser for the .ISO file format.
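To make the "forward-only goto exits" idea concrete, here is a minimal sketch of that error-handling style as I understand it. The function read_header and its parameters are made up for illustration; they are not taken from the kernel or e2fsprogs.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the first `len` bytes of `path` into a newly
 * allocated buffer. Returns 0 on success, -1 on failure. On success the
 * caller owns *out and must free() it. */
int read_header(const char *path, size_t len, unsigned char **out)
{
    int ret = -1;               /* assume failure until everything succeeds */
    unsigned char *buf = NULL;
    FILE *fp;

    fp = fopen(path, "rb");
    if (!fp)
        goto out;               /* nothing to clean up yet */

    buf = malloc(len);
    if (!buf)
        goto out_close;         /* only the file needs closing */

    if (fread(buf, 1, len, fp) != len)
        goto out_free;          /* short read: release the buffer too */

    *out = buf;
    buf = NULL;                 /* ownership moved to the caller */
    ret = 0;

    /* Labels only ever jump forward, and cleanup runs in reverse order
     * of acquisition. The success path falls through the same labels. */
out_free:
    free(buf);                  /* free(NULL) is a no-op */
out_close:
    fclose(fp);
out:
    return ret;
}
```

The nice part of this layout is that every resource is released in exactly one place, so adding another step only means adding one more label.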
My Learnings(know your language/know your compiler/know your tools)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. The _t suffix is a convention used to denote a typedef for a given type
/* This example is from do_journal.c in e2fsprogs */
struct journal_transaction_s {
...
blk64_t start, end;
...
};
typedef struct journal_transaction_s journal_transaction_t;
2. Know about headers like stddef.h and stdint.h and when they are supposed to
be used. For example: when to use normal data types like int vs int16_t etc.
3. From https://en.cppreference.com/w/c/types/integer we can see that the
intN_t types have exact widths, which might have some perf side effects if
the underlying hardware does not support that width natively.
For example, in Visual Studio x86/x64 we have typedef short int16_t; and
typedef int int32_t;
The int_fastN_t types, on the other hand, pick a suitable width that maps
natively to an available hardware type. For example, in Visual Studio
x86/x64 we have typedef int int_fast16_t; instead of typedef short
int_fast16_t;
size_t, the unsigned type that sizeof yields, in practice follows the
natural word size of the hardware. For example, on x86 typedef unsigned int
size_t; and on x64 typedef unsigned __int64 size_t;
4. Know your compiler's predefined macros
On the Microsoft compiler
_WIN64 - defined only when compiling for a 64-bit target such as x64
_WIN32 - defined when compiling for either x86 or x64
_MSC_VER - expands to the compiler version number, so it tells different Visual Studio versions apart
__cplusplus - defined when the translation unit is compiled as C++
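Since _WIN32 is defined for both 32-bit and 64-bit targets, the order of the checks matters. A small sketch of how I would test them (build_target is a made-up name, not from any project):

```c
/* Returns a short description of the target this file was compiled for.
 * _WIN64 must be tested first, because _WIN32 is defined in both cases. */
const char *build_target(void)
{
#if defined(_WIN64)
    return "64-bit Windows";
#elif defined(_WIN32)
    return "32-bit Windows";
#else
    return "not Windows";
#endif
}
```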
5. We can get a FILE* from a HANDLE using the APIs below from io.h and fcntl.h
Fd = _open_osfhandle((intptr_t)Handle, _O_TEXT);
File = _wfdopen(Fd, L"r");
Once we have the FILE* we can use fgets for line-oriented string operations.
6. Learned about var args and aligned memory
Aligned memory means the address returned by _aligned_malloc is always
divisible by the alignment we specify.
For example: char *p = _aligned_malloc(10, 4); the address returned in p will
always be divisible by 4. We should also free the allocated memory using
_aligned_free(p)
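A rough sketch of how the alignment guarantee can be checked. The wrapper names alloc_aligned, free_aligned and is_aligned are my own; outside MSVC I am assuming C11 aligned_alloc as the closest standard counterpart, which is a swapped-in technique and not the same API.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _MSC_VER
#include <malloc.h>     /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical wrapper: allocates at least `size` bytes whose address is a
 * multiple of `alignment` (assumed to be a power of two). */
void *alloc_aligned(size_t size, size_t alignment)
{
#ifdef _MSC_VER
    return _aligned_malloc(size, alignment);
#else
    /* C11 aligned_alloc wants the size to be a multiple of the alignment,
     * so round it up first. */
    size_t rounded = (size + alignment - 1) / alignment * alignment;
    return aligned_alloc(alignment, rounded);
#endif
}

/* Memory from _aligned_malloc must go to _aligned_free, not free. */
void free_aligned(void *p)
{
#ifdef _MSC_VER
    _aligned_free(p);
#else
    free(p);
#endif
}

/* Returns 1 if the address in `p` is divisible by `alignment`, else 0. */
int is_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}
```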
7. atoi(str) processes the input string only as far as it can convert. For
example atoi("123asda") will still return 123. Any whitespace at the
beginning of the input string is ignored, so atoi(" 123asd") will still
return 123. It is recommended to use the strto* functions (strtol, strtod
etc.) to convert strings to int/long/float types, as they can also return a
pointer to the first character that could not be converted.
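Here is a small sketch of the strtol approach. parse_int is a hypothetical helper I wrote for this note, with a made-up policy of rejecting trailing junk; adjust to taste.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses `s` as a base-10 int.
 * Returns 0 and stores the value in *out on success.
 * Returns -1 if `s` has no digits, has trailing characters,
 * or the value does not fit in an int. */
int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);    /* skips leading whitespace, like atoi */
    if (end == s)               /* no digits were consumed */
        return -1;
    if (*end != '\0')           /* stopped at a non-numeric character */
        return -1;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;              /* out of range for long or for int */
    *out = (int)v;
    return 0;
}
```

Unlike atoi, this lets the caller tell "123asda" and "abc" apart from a valid "123".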
8. UCRT supports around 40 POSIX system-level APIs, but most of these have an _
prefix. wimlib, in wimlib_tchar.h, defines #define topen _open for
Win32 and #define topen open for POSIX systems. The takeaway here is that
even though the UCRT names differ, the parameters are the same.
For example:
UCRT Win32: int _open(const char *filename, int oflag, int pmode);
POSIX: int open(const char *pathname, int flags, mode_t mode);
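A minimal sketch of that macro trick in the spirit of wimlib's topen. The names xopen, xclose, X_RDONLY and can_open_readonly are made up here and are not wimlib's.

```c
#include <fcntl.h>
#ifdef _WIN32
#include <io.h>             /* _open, _close */
#define xopen    _open
#define xclose   _close
#define X_RDONLY _O_RDONLY
#else
#include <unistd.h>         /* close */
#define xopen    open
#define xclose   close
#define X_RDONLY O_RDONLY
#endif

/* Hypothetical helper: returns 1 if `path` can be opened read-only,
 * 0 otherwise. The same source compiles against UCRT and POSIX. */
int can_open_readonly(const char *path)
{
    int fd = xopen(path, X_RDONLY);
    if (fd < 0)
        return 0;
    xclose(fd);
    return 1;
}
```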
10. Best video on the C standard and some of its lesser-known features - "New" Features in C - Dan Saks
Year  C Standard  Comments
1983              C standard committee is formed
1989  C89         C89 US standard
1990  C90         C89 International Standard
1999  C99         C99 Standard
2011  C11         C11 Standard
2018  C18         C18 Bugfix release
_reserved - Reserved at global scope. But we can use an identifier starting
with _ as a local variable or a structure member.
__reserved - Always reserved, meaning a user program should not use any
identifier starting with two underscores __
_Reserved - Always reserved, meaning a user program should not use any
identifier starting with an underscore and a capital letter.
This is why _Bool is named that way: to avoid breaking the bool typedefs
used in existing code.
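11. Another good video on lesser-known C features - Choosing the Right Integer
Types in C and C++ - Dan Saks - code::dive 2018
We can use CHAR_BIT from limits.h instead of a hard-coded 8. For example, when
you want to print the bits in an integer, we can do the below. Note that with
an unsigned size_t counter the condition i >= 0 is always true and the loop
never ends, so the decrement goes into the test: `for (size_t i = sizeof(int) *
CHAR_BIT; i-- > 0; ) {...}`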
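Written out in full, the bit loop could look like the sketch below. uint_to_bits is a made-up helper; the important detail is that counting down with an unsigned counter needs the `i-- > 0` form, because an unsigned value can never be negative.

```c
#include <limits.h>
#include <stddef.h>

/* Writes the bits of `value`, most significant first, into `out` as a
 * NUL-terminated string of '0' and '1'. `out` must have room for
 * sizeof(unsigned int) * CHAR_BIT + 1 characters. */
void uint_to_bits(unsigned int value, char *out)
{
    size_t nbits = sizeof(unsigned int) * CHAR_BIT;
    size_t i;

    /* i runs from nbits - 1 down to 0: the test compares the old value
     * and then decrements, so the loop ends once i was 0 */
    for (i = nbits; i-- > 0; )
        *out++ = ((value >> i) & 1u) ? '1' : '0';
    *out = '\0';
}
```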
12. size_t follows the natural word size of the target architecture. So for
32-bit it is a 4-byte unsigned quantity and for 64-bit it is an 8-byte
unsigned quantity. Hence MSVC defines it as follows
#ifdef _WIN64
typedef unsigned __int64 size_t; //8 bytes on x64
#else
typedef unsigned int size_t; //4 bytes on x86
#endif
whereas uintmax_t denotes the largest unsigned integer type that is available
in the language. So on 32-bit you can still represent a 64-bit quantity using
long long even though it is not what the architecture directly maps to. So
below is how it is defined on both x86 and x64
typedef unsigned long long uintmax_t; //in MSVC both x86 and x64 support 64-bit quantities using long long
So size_t does not give us the largest unsigned integer; instead it gives us
the native unsigned integer, i.e., on x86 it is 32 bits and on x64 it is
64 bits. So the recommendation is to use size_t instead of int for sizes and
lengths. For example:
int len = strlen(str); // not recommended: int is 4 bytes on both x86 and x64 MSVC (LLP64), so the result is truncated on x64
size_t len = strlen(str); // recommended: size_t is automatically 4 bytes on x86 and 8 bytes on x64
13. C11 introduced static asserts. These are assertions that are evaluated
at compile time. C11 has a new keyword for this,
_Static_assert(expr, message). The reason for the ugly name is the same idea
of not breaking existing code, so for convenience the assert.h header
provides a static_assert macro which means the same. One use of static
asserts is to check assumptions about type sizes at compile time.
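For example, a sketch that pins down the layout of a structure at compile time. struct record_header is hypothetical and only there for illustration; this needs a C11 compiler.

```c
#include <assert.h>     /* static_assert macro (C11) */
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk record header, for illustration only */
struct record_header {
    uint32_t magic;
    uint16_t version;
    uint16_t flags;
};

/* Both checks are evaluated by the compiler; if one fails, the build
 * stops with the given message and no code is generated for it. */
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(sizeof(struct record_header) == 8,
              "record_header must be 8 bytes with no padding");

/* Trivial accessor so the struct is used by real code */
size_t record_header_size(void)
{
    return sizeof(struct record_header);
}
```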
14. Another good video on some low level details - Storage Duration and Linkage
in C and C++ - Dan Saks
15. #define _CRT_SECURE_NO_WARNINGS can be used to disable CRT warning for common functions.
16. Any UCRT function whose name begins with _ is a non-standard API provided
by UCRT. For example, _strdup, _strlwr and _strrev in string.h. The takeaway
here is that it is easy to tell which functions are part of the C standard
and which are not. Interestingly, some (not all) of these non-standard
functions are part of POSIX, so in glibc (which implements POSIX) they don't
have the _ prefix.
17. In the POSIX standard, a function with the [CX] annotation is an extension
to the ISO C standard. For example, the functions below from stdlib.h and
stdio.h are POSIX extensions. UCRT defines similar APIs such as _putenv and
_fileno; since these are not part of the C standard, the UCRT versions have
an _ prefix.
stdlib.h - posix
[CX] int setenv(const char *, const char *, int);
stdlib.h - ucrt
int _putenv( const char *envstring );
stdio.h - posix
[CX] int fileno(FILE *);
stdio.h - ucrt
int _fileno( FILE *stream );
18. Learned about CGold: The Hitchhiker’s Guide to the CMake. An awesome
tutorial about CMake. Now it is super easy to start a C project without
worrying about the individual build systems.
# CMakeLists.txt - minimum content
cmake_minimum_required(VERSION 3.4)
project(command_line_parser)
add_executable(command_line_parser main.c)
# commands to run to generate the native build files such as vcxproj files
# In the command below, -S stands for the source directory path and
# -B stands for the directory where the vcxproj files are generated (both options need CMake 3.13 or newer)
# CMake only generates one flavor (x64/x86) per project file; here we generate x64 by specifying -A x64
cmake -S . -B builds -G "Visual Studio 16 2019" -A x64
# we can also use cmake-gui to do the above
# Once vcxproj files are generated we can either directly build the proj files using Visual Studio
# or better use cmake itself to build it for us from CMD using msbuild
cmake --build builds
> _t suffix is the notion used to denote a typedef for a given type /* This example is from do_journal.c in e2fsprogs */ struct journal_transaction_s { ... blk64_t start, end; ... }; typedef struct journal_transaction_s journal_transaction_t;
Speaking of the _t suffix. I think it's being largely abused: we say 'size_t' (instead of 'size') because it is not obvious whether 'size' is a type or not; on the other hand, in the case of, say, 'int32_t' it is clearly redundant and therefore has always looked kinda silly to me.
Good C projects/articles
~~~~~~~~~~~~~~~
1. wimlib (wimlib.net) - https://github.com/jcpowermac/wimlib is a great source of information
2. e2fsprogs - https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/
3. MUSL - https://git.musl-libc.org/cgit/musl/tree/
4. General C Coding Style - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
5. https://nullprogram.com/tags/c/ - great source of C knowledge
6. CCAN - https://github.com/rustyrussell/ccan/tree/master/ccan - great source of C tidbits from none other than Rusty Russell - I haven't read all of them
7. POSIX 2018 standard - https://pubs.opengroup.org/onlinepubs/9699919799.2018edition...
continued in the comment....