Hacker News | vineel567's comments

I am in the same boat as you are. Bear with me; I am by no means an expert.

In fact, I read a couple of chapters of Modern C yesterday :). Here are some of the things I am doing to bring my C skills closer to those of the professional developers I admire.

Decide which platform to use

~~~~~~~~~~~~~~~~~~

Unfortunately, becoming proficient in C means writing code, and that means settling on a platform. I went back and forth between developing on Windows and on Linux. I am very experienced in the Windows environment (debuggers, cl, the linker, WinDbg, etc.), but when it comes to writing good quality C code (not C++) and learning how to write a maintainable, moderately large code base, my research showed that the Windows compilers and C standard APIs are not great; in fact, they hinder your productivity. I wasted countless hours just trying to figure out how to properly create a simple C project with a decent build system, and I could not find one. The closest I could find is CMake, as MSBuild is a nightmare to work with. I even tried NMAKE but failed. For documentation of basic C API usage, MSDN (https://docs.microsoft.com/en-us/cpp/c-runtime-library/run-t...) does a decent job. But in the name of security you will find a zillion variations (_s, _l) of each API, and by default the MSVC compiler complains when you use some of the APIs in their simple forms unless you define _CRT_SECURE_NO_WARNINGS etc. I think that for someone just getting started with writing a decent code base in C, these restrictions really hinder productivity. So finally, I decided to focus my learning on the Linux platform (currently through WSL, the Windows Subsystem for Linux) with its POSIX APIs. You know what, `man 3 printf` or `man 3 strlen` is soooooo much better than googling MSDN.

Mastering C

~~~~~~~

I think the simple and straight answer here is reading good code, writing "good" code, and reading good C content (be it books or articles). These are the three ingredients needed to get started. Of all the open source projects that I have investigated, I found that the Linux kernel and related projects have very good taste in terms of code quality. Currently, I am focused on how they use the language rather than on what the project actually does: how they structure the project, how they name things, how they use types, how they create structures, how they pass structures to functions, how they use lightweight object-based constructs, how they handle errors in functions (for example forward-only goto exits), how they use signed/unsigned variables (more of my learnings at the end), and how they use their own data structures. I think it's good to initially target the ANSI C API with C99 instead of relying heavily on the OS-specific API of whichever platform you choose. For example, such a project could be a parser for a binary file format such as .ISO.
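
To make the "forward only goto exits" idea concrete, here is a minimal sketch in the style I have seen in kernel-adjacent code. The function name read_prefix and its parameters are made up for illustration; they are not taken from any of the projects mentioned here. Each failure jumps forward to a label that releases only what has been acquired so far.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies up to cap - 1 bytes from the start of the file
 * at `path` into `dst` and NUL-terminates it. Returns 0 on success, -1 on error. */
static int read_prefix(const char *path, char *dst, size_t cap)
{
    int ret = -1;              /* assume failure until every step succeeded */
    FILE *fp;
    char *buf;
    size_t n;

    assert(path != NULL && dst != NULL);
    if (cap == 0)
        goto out;

    fp = fopen(path, "rb");
    if (!fp)
        goto out;              /* nothing acquired yet */

    buf = malloc(cap);
    if (!buf)
        goto out_close;        /* only the file needs releasing */

    n = fread(buf, 1, cap - 1, fp);
    if (ferror(fp))
        goto out_free;         /* release the buffer, then the file */

    memcpy(dst, buf, n);
    dst[n] = '\0';
    ret = 0;

    /* The success path falls through the same labels, so cleanup is written once. */
out_free:
    free(buf);
out_close:
    fclose(fp);
out:
    return ret;
}
```

A caller would just check the return value, e.g. `if (read_prefix("notes.txt", text, sizeof text) != 0) { /* handle error */ }`. The labels are ordered in reverse of acquisition, which is what keeps every jump forward-only.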

Good C projects/articles

~~~~~~~~~~~~~~~

1. wimlib (wimlib.net) - https://github.com/jcpowermac/wimlib is a great source of information

2. e2fsprogs - https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/

3. MUSL - https://git.musl-libc.org/cgit/musl/tree/

4. General C Coding Style - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

5. https://nullprogram.com/tags/c/ - great source of C knowledge

6. CCAN - https://github.com/rustyrussell/ccan/tree/master/ccan - great source of C tidbits from none other than Rusty Russell - I haven't read all of them

7. POSIX 2018 standard - https://pubs.opengroup.org/onlinepubs/9699919799.2018edition...

continued in the comment....


My Learnings (know your language/know your compiler/know your tools)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. The _t suffix is the convention used to denote a typedef for a given type.

    /* This example is from do_journal.c in e2fsprogs */
    struct journal_transaction_s {
        ...
        blk64_t start, end;
        ...
    };
    typedef struct journal_transaction_s journal_transaction_t;

2. Know about headers like stddef.h and stdint.h and when they are supposed to be used, for example when to use plain types like int versus fixed-width types like int16_t.

3. From https://en.cppreference.com/w/c/types/integer we can see that the intN_t types have exact widths, which might have performance side effects if the underlying hardware does not support that width natively.

    For example, Visual Studio on x86/x64 has

    typedef short int16_t;
    typedef int   int32_t;

    The int_fastN_t types, on the other hand, pick a suitable width that maps
    natively onto the available hardware types. For example, Visual Studio on
    x86/x64 has typedef int int_fast16_t; instead of typedef short
    int_fast16_t;. size_t is the unsigned type used for object sizes; on these
    targets it matches the natural unsigned word length of the hardware:

    typedef unsigned int     size_t;   // x86
    typedef unsigned __int64 size_t;   // x64

4. Know your compiler's predefined macros. On the Microsoft compiler:

    _WIN64      - defined when compiling for x64
    _WIN32      - defined when compiling for either x86 or x64
    _MSC_VER    - the compiler version, which differs between Visual Studio versions
    __cplusplus - defined when the translation unit is compiled as C++

5. We can get a FILE* from a Win32 HANDLE using the APIs below (_open_osfhandle from io.h, _O_TEXT from fcntl.h, _wfdopen from stdio.h):

    Fd = _open_osfhandle((intptr_t)Handle, _O_TEXT);
    File = _wfdopen(Fd, L"r");

    Once we have the FILE*, we can use fgets for line-oriented string operations.

6. Learned about varargs and aligned memory. Aligned memory means the address returned by _aligned_malloc is always divisible by the alignment we specify. For example, after char *p = _aligned_malloc(10, 4); the address in p will always be divisible by 4. Memory allocated this way must be freed with _aligned_free(p).

7. atoi(str) converts as much of the input string as it can. For example, atoi("123asda") still returns 123. Leading whitespace is ignored, so atoi(" 123asd") also returns 123. It is recommended to use the strto* functions (strtol, strtod, etc.) to convert strings to int/long/float types, as they can also return a pointer to the first character that was not converted.

8. The UCRT supports around 40 POSIX system-level APIs, but most of them have a _ prefix. wimlib, in wimlib_tchar.h, defines #define topen _open for Win32 and #define topen open for POSIX systems. The takeaway is that even though the UCRT names differ, the parameters are essentially the same.

    For example:
      UCRT Win32: int _open(const char *filename, int oflag, int pmode);
      POSIX:      int  open(const char *pathname, int flags, mode_t mode);
9. We can install only the build tools (the VC compiler) without the IDE from https://aka.ms/buildtools

10. Best video on the C standard and some of its lesser known features - "New" Features in C - Dan Saks

    Year  C Standard  Comments
    1983              C standard committee is formed
    1989  C89         C89 US standard
    1990  C90         C89 international standard
    1999  C99         C99 standard
    2011  C11         C11 standard
    2018  C18         C18 bugfix release

    _reserved - Reserved at global (file) scope, but we can still use an
    identifier that starts with a single _ followed by a lowercase letter as a
    local variable or a structure member.

    __reserved - Always reserved, meaning a user program should not use any
    identifier that starts with two underscores.

    _Reserved - Always reserved, meaning a user program should not use any
    identifier that starts with an underscore followed by a capital letter.

    This is why _Bool is named that way: to avoid breaking the bool typedefs
    already present in existing code.

11. Another good video on lesser known C features - Choosing the Right Integer Types in C and C++ - Dan Saks - code::dive 2018

    We can use CHAR_BIT from limits.h instead of a hard-coded 8. For example,
    to visit the bits of an int from most to least significant, we can write
    `for (size_t i = sizeof(int) * CHAR_BIT; i-- > 0; ) {...}`. The condition
    cannot be written as i >= 0, because size_t is unsigned and that test is
    always true.

12. size_t is the unsigned type that sizeof yields; on common platforms it matches the native word size of the architecture. So on 32-bit targets it is a 4-byte unsigned quantity and on 64-bit targets it is an 8-byte unsigned quantity. In MSVC it is defined as follows

    #ifdef _WIN64
        typedef unsigned __int64 size_t;   //8 bytes on x64
    #else
        typedef unsigned int     size_t;   //4 bytes on x86
    #endif

    whereas uintmax_t denotes the widest unsigned integer type available in
    the language. So on a 32-bit target you can still represent a 64-bit
    quantity using long long even though it is not what the architecture
    directly maps to. This is how it is defined on both x86 and x64:

    typedef unsigned long long uintmax_t;  // MSVC supports 64-bit quantities via long long on both x86 and x64

    So size_t does not give us the widest unsigned integer; instead it gives
    us the native unsigned integer, i.e. 32 bits on x86 and 64 bits on x64.
    The recommendation is to use size_t for sizes and lengths instead of int.
    For example:

    int len = strlen(str);    // not recommended: int is 4 bytes on both x86 and x64 in MSVC due to LLP64
    size_t len = strlen(str); // recommended: size_t is 4 bytes on x86 and 8 bytes on x64

13. C11 introduced static assertions: assertions that are evaluated at compile time. For this, C11 has a new keyword, _Static_assert(expr, message). The reason for the ugly name is the same idea of not breaking existing code, so for convenience the assert.h header provides a static_assert macro that means the same thing. One use of static asserts is shown below: the build fails if the compiler inserts padding into the struct.

    struct book {
      int pages;
      char author[10];
      float price;
    };

    static_assert(sizeof(struct book) == sizeof(int) + 10 * sizeof(char) + sizeof(float),
                  "structure contains padding holes!");
14. Another good video on some low level details - Storage Duration and Linkage in C and C++ - Dan Saks

15. #define _CRT_SECURE_NO_WARNINGS (placed before any CRT header is included) can be used to disable the CRT deprecation warnings for common functions.

16. Any UCRT function whose name begins with _ is a non-standard API provided by the UCRT. For example, string.h has _strdup, _strlwr and _strrev, among others. The takeaway is that it is easy to identify which functions are part of the C standard and which are not. Interestingly, some (not all) of these non-standard functions are part of POSIX, so in glibc (which implements POSIX) their names have no leading _.

17. Every function in the POSIX standard with a [CX] annotation is an extension to the ISO C standard. For example, the function below from stdlib.h is a POSIX extension. The UCRT defines a similar API called _putenv; since it is not part of the C standard, the UCRT version has a leading _.

    stdlib.h - posix
    [CX] int setenv(const char *, const char *, int);
    stdlib.h - ucrt
    int _putenv( const char *envstring );

    stdio.h - posix
    [CX] int fileno(FILE *);
    stdio.h - ucrt
    int _fileno( FILE *stream );
18. Learned about CGold: The Hitchhiker’s Guide to the CMake, an awesome tutorial about CMake. With CMake it is now easy to start a C project without worrying about the individual build systems.

    # CMakeLists.txt - minimum content
    cmake_minimum_required(VERSION 3.4)
    project(command_line_parser)
    add_executable(command_line_parser main.c)

    # Commands to run to generate the native build files, such as vcxproj files.
    # In the command below, -S stands for the source directory path and
    # -B stands for the directory where the vcxproj files are generated.
    # CMake generates only one flavor (x64/x86) per project file; here we pick x64 with -A x64.
    cmake -S . -B builds -G "Visual Studio 16 2019" -A x64
    # we can also use cmake-gui to do the above

    # Once vcxproj files are generated we can either directly build the proj files using Visual Studio
    # or better use cmake itself to build it for us from CMD using msbuild
    cmake --build builds
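
For item 7 (atoi vs the strto* functions), here is a small sketch of what stricter parsing with strtol can look like. The helper name parse_int and its return convention are my own choices for illustration, not part of any standard API.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses the whole string `s` as a base-10 int.
 * Returns 0 on success and stores the value in *out; returns -1 if there are
 * no digits, trailing junk, or the value does not fit in an int. */
static int parse_int(const char *s, int *out)
{
    char *end = NULL;
    long v;

    assert(s != NULL && out != NULL);

    errno = 0;                       /* strtol reports overflow through errno */
    v = strtol(s, &end, 10);

    if (end == s)                    /* nothing was converted, e.g. "" or "abc" */
        return -1;
    if (*end != '\0')                /* trailing characters, e.g. "123asda" */
        return -1;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;                   /* out of range for long or for int */

    *out = (int)v;
    return 0;
}
```

With this, parse_int("123asda", &n) reports an error, whereas atoi("123asda") silently returns 123. Leading whitespace is still accepted, because strtol skips it just like atoi does.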

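For item 11 (CHAR_BIT and counting down with a size_t), this is a minimal sketch of a bit formatter. The name format_bits is made up for this example, and it assumes unsigned has no padding bits, which holds on the usual desktop targets.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: writes the bits of `value`, most significant first,
 * into `out`, which must have room for sizeof(unsigned) * CHAR_BIT + 1 chars. */
static void format_bits(unsigned value, char *out)
{
    const size_t nbits = sizeof value * CHAR_BIT;
    size_t i;

    assert(out != NULL);

    /* "i-- > 0" compares first and decrements afterwards, so the body sees
     * nbits - 1 down to 0 and then the loop ends. Writing "i >= 0" would
     * never terminate because size_t is unsigned. */
    for (i = nbits; i-- > 0; )
        *out++ = ((value >> i) & 1u) ? '1' : '0';
    *out = '\0';
}
```

A caller would declare `char buf[sizeof(unsigned) * CHAR_BIT + 1];` and then call format_bits(5u, buf); the last three characters of buf are then "101".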

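For item 13, here is a sketch of static_assert checks that are meant to pass, as a complement to the padding check above. The struct record_header is a made-up layout for illustration, and the expected size and offset assume a common ABI where uint64_t fields land on 8-byte offsets in this arrangement.

```c
#include <assert.h>   /* static_assert macro (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical on-disk record header, used only for illustration. */
struct record_header {
    uint32_t magic;     /* bytes 0..3  */
    uint16_t version;   /* bytes 4..5  */
    uint16_t flags;     /* bytes 6..7  */
    uint64_t length;    /* bytes 8..15 */
};

/* These checks cost nothing at run time; the build fails if one is false. */
static_assert(sizeof(struct record_header) == 16,
              "record_header must be exactly 16 bytes");
static_assert(offsetof(struct record_header, length) == 8,
              "length must start at byte offset 8");
```

Because the fields are ordered so that each one is naturally aligned, the compiler has no reason to insert padding, and the assertions document that expectation right next to the struct.
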
Hope these help.


> _t suffix is the notion used to denote a typedef for a given type /* This example is from do_journal.c in e2fsprogs */ struct journal_transaction_s { ... blk64_t start, end; ... }; typedef struct journal_transaction_s journal_transaction_t;

This is undefined behavior in POSIX.


And

    unsigned total(unsigned a, unsigned b) { return a + b; }
is undefined behavior in C.


Speaking of the _t suffix. I think it's being largely abused: we say 'size_t' (instead of 'size') because it is not obvious whether 'size' is a type or not; on the other hand, in the case of, say, 'int32_t' it is clearly redundant and therefore has always looked kinda silly to me.


How were those awesome illustrations created? Which tool?


Looks a bit like balsamiq. But there's also this tool called "on-staff graphic designer/illustrator" that they could have used as well.


Mariko made all of those illustrations herself. Check out her “alternative introduction to promises” [1] on her personal blog.

[1] https://kosamari.com/notes/the-promise-of-a-burger-party


Self-production is also always an option.

