nRF9160 CMSIS DSP Library Integration and Code Size Explosion

When integrating the CMSIS DSP library into an nRF9160 project (Arm Cortex-M33 application core), developers often encounter significant code bloat. The problem typically appears when several CMSIS DSP modules (e.g., FastMath, ComplexMath, Statistics, and Transform) are enabled through configuration flags in the Zephyr build system. The resulting binary can approach 512 KB, which is impractical for a resource-constrained device like the nRF9160. The usual root cause is that code and constant data the application never uses are not pruned at link time, so they end up in the final binary.

The CMSIS DSP library is designed to be modular, allowing developers to include only the mathematical and signal processing functions they need. In Zephyr, however, the configuration flags work at module level: CONFIG_CMSIS_DSP=y enables the library, and each per-module option (e.g., CONFIG_CMSIS_DSP_TRANSFORM=y) compiles every source file of that module, regardless of whether the application calls its functions. The flags offer no finer-grained way to select individual functions.

To address this issue, developers must understand how the Zephyr build system, the CMSIS DSP library, and the linker interact. The configuration flags decide which modules are compiled; they do not decide which individual functions reach the final image. That decision is left to the linker, so if its unused-code elimination is disabled or ineffective, every function from the enabled modules can end up in the binary, even those the application never calls.

Linker Behavior and Unused Function Elimination

The primary cause of code bloat in this scenario is unused CMSIS DSP code surviving the link step. Modern linkers, such as GNU ld and the Arm linker, include a feature often called "Unused Function Elimination" (UFE); GNU ld calls it section garbage collection, and the Arm linker calls it unused section elimination. It removes code and data that nothing reachable from the program's entry points refers to, reducing the final binary size. The feature works on whole input sections rather than on individual functions, so it is only as precise as the section layout the compiler produces.

In the case of the CMSIS DSP library, UFE may remove less than expected for several reasons. First, functions in the library depend on one another and on constant lookup tables. For example, a function in the FastMath module may call a function in the ComplexMath module, even if the application never calls the ComplexMath function directly. The linker handles this correctly: everything reachable from a function the application uses is, from the linker's point of view, in use and must be kept. A single call can therefore pull in a chain of helpers and tables that the developer did not expect.

Second, the Zephyr build system’s configuration flags select entire modules, even if only a small subset of their functions is needed. For instance, enabling CONFIG_CMSIS_DSP_TRANSFORM=y compiles all functions in the Transform module, regardless of whether they are used. Whether that code then reaches the final image depends entirely on the linker discarding it, so any weakness in link-time elimination shows up directly as code bloat.

Finally, weak symbols and inline functions are often blamed for hindering UFE, but their effect is more limited than it may seem. Weak symbols provide default implementations that the application can override. They do not block garbage collection by themselves; a weak default stays in the binary only if retained code references it and no strong definition replaces it. Inline functions that are never called generate no code at all. When they are called, their bodies are expanded at each call site at compile time, which can increase code size in a way the linker cannot undo, because the copies are part of functions that are genuinely in use.
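
The weak-symbol behavior described above can be illustrated with a minimal sketch. The names app_dsp_gain and apply_gain are hypothetical and are not part of CMSIS DSP; the attribute syntax assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * app_dsp_gain() is a hypothetical hook, not a CMSIS DSP function.
 * The weak attribute (GCC/Clang syntax) marks this definition as a
 * default: a non-weak definition of the same name in another object
 * file replaces it at link time.
 */
__attribute__((weak)) int32_t app_dsp_gain(void)
{
    return 1; /* default: unity gain */
}

/*
 * apply_gain() references app_dsp_gain(), so whichever definition the
 * linker selects stays reachable as long as apply_gain() itself is used.
 */
void apply_gain(const int32_t *src, int32_t *dst, size_t len)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i] * app_dsp_gain();
    }
}
```

If nothing in the application called apply_gain(), both functions would be candidates for removal under section garbage collection; the weak attribute alone does not keep the default alive.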

Fine-Grained CMSIS DSP Library Integration and Linker Optimization

To mitigate code bloat in NRF9160 projects using the CMSIS DSP library, developers must adopt a fine-grained approach to library integration and linker optimization. This involves selectively including only the required functions and modules, as well as configuring the linker to maximize unused function elimination.

Step 1: Identify Required CMSIS DSP Functions

The first step is to identify the specific CMSIS DSP functions used by the application. This can be done by analyzing the application code and determining which mathematical or signal processing operations are performed. For example, if the application only requires Fast Fourier Transform (FFT) functions, there is no need to include the entire Transform module.

Step 2: Manually Include Required Functions

Once the required functions are identified, developers can manually include them in the project instead of relying on the Zephyr build system’s configuration flags. This involves copying the necessary source files from the CMSIS DSP library into the project directory and modifying the build system to compile only these files. For example, if the application requires the arm_cfft_f32 function from the Transform module, the corresponding source file (e.g., arm_cfft_f32.c) should be copied and included in the build, together with any helper sources and constant tables it depends on. Undefined-symbol errors at link time show which dependencies are still missing.

Step 3: Configure Linker for Unused Function Elimination

To maximize the effectiveness of the linker’s UFE feature, developers should ensure that both the compiler and the linker are configured for it. With GCC, the compiler flags -ffunction-sections and -fdata-sections place each function and data object in its own section, and the linker flag --gc-sections (passed through the compiler driver as -Wl,--gc-sections) discards the sections that nothing references. Both halves are needed: without per-function sections, the linker can only keep or drop much larger sections as a whole. Zephyr normally sets these flags already, so the first thing to check is that they actually appear in the compile and link commands of the build.
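
As a minimal sketch of what these flags act on, assume a translation unit with two helper functions, of which the application calls only one. The functions sum_q15 and peak_q15 are hypothetical stand-ins and are not part of CMSIS DSP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not part of CMSIS DSP.
 *
 * Compiled with -ffunction-sections, each function below is placed in
 * its own input section (.text.sum_q15 and .text.peak_q15) instead of
 * sharing one .text section. Linked with -Wl,--gc-sections, a section
 * that nothing reachable from the entry point refers to is discarded.
 */

/* Assume the application calls this one: its section is kept. */
int32_t sum_q15(const int16_t *src, size_t len)
{
    assert(src != NULL);
    int32_t acc = 0;
    for (size_t i = 0; i < len; i++) {
        acc += src[i];
    }
    return acc;
}

/* Assume nothing calls this one: its section can be garbage collected. */
int32_t peak_q15(const int16_t *src, size_t len)
{
    assert(src != NULL);
    int32_t peak = 0;
    for (size_t i = 0; i < len; i++) {
        int32_t mag = src[i] < 0 ? -(int32_t)src[i] : (int32_t)src[i];
        if (mag > peak) {
            peak = mag;
        }
    }
    return peak;
}
```

Without -ffunction-sections both functions would typically share a single .text section in the object file, and a reference to either one would keep both.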

Step 4: Verify Binary Size and Function Inclusion

After implementing the above steps, developers should verify the final binary size and ensure that only the required functions are included. This can be done by analyzing the linker map file (in a Zephyr build, build/zephyr/zephyr.map), which lists the sections and objects placed in the binary along with their sizes. A GNU ld map file also lists the input sections that were discarded, so it can confirm both that unused functions have been eliminated and where the remaining size comes from. Comparing the map file before and after each change shows which step actually reduced the binary.

Step 5: Optimize Further with Conditional Compilation

For additional optimization, developers can use conditional compilation to exclude unused code at compile time. This involves wrapping the application code that calls CMSIS DSP functions in preprocessor directives (e.g., #ifdef) and defining the corresponding macros in the build system. For example, if the application only uses floating-point FFT functions, the build system can define a macro (e.g., USE_FLOAT_FFT) so that only the floating-point path is compiled. Code excluded this way never reaches the linker, so it cannot pull in library functions or tables.
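
A minimal sketch of this pattern follows. USE_FLOAT_FFT is the example macro from the text; USE_Q15_FFT and all function names are hypothetical, and the function bodies are simple stand-ins for where a real project would call into CMSIS DSP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * In a real project these macros would come from the build system
 * (e.g. -DUSE_FLOAT_FFT), not from the source file. USE_FLOAT_FFT is
 * defined here only so that the sketch builds on its own.
 */
#ifndef USE_FLOAT_FFT
#define USE_FLOAT_FFT 1
#endif

#ifdef USE_FLOAT_FFT
/* Stand-in for the float32 path (sum of squares of the input). */
float app_energy_f32(const float *src, size_t len)
{
    assert(src != NULL);
    float acc = 0.0f;
    for (size_t i = 0; i < len; i++) {
        acc += src[i] * src[i];
    }
    return acc;
}
#endif

#ifdef USE_Q15_FFT
/* Compiled only when USE_Q15_FFT is defined; otherwise this function
 * and everything it would reference never reach the linker. */
int64_t app_energy_q15(const int16_t *src, size_t len)
{
    assert(src != NULL);
    int64_t acc = 0;
    for (size_t i = 0; i < len; i++) {
        acc += (int64_t)src[i] * src[i];
    }
    return acc;
}
#endif

/* Bit 0: float path compiled in. Bit 1: q15 path compiled in. */
unsigned app_enabled_paths(void)
{
    unsigned mask = 0u;
#ifdef USE_FLOAT_FFT
    mask |= 1u;
#endif
#ifdef USE_Q15_FFT
    mask |= 2u;
#endif
    return mask;
}
```

The design choice here is to gate whole wrapper functions rather than individual statements, which keeps each build variant readable and makes it easy to see in the map file which paths were compiled.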

Step 6: Leverage Compiler-Specific Features

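Finally, developers should leverage compiler-specific features to further reduce code size. For example, the legacy Arm Compiler 5 (armcc) provides the --split_sections flag, which places each function in its own section so that the linker can remove unused functions; it does not split individual functions into parts. Arm Compiler 6 (armclang) uses -ffunction-sections for the same purpose. Similarly, GCC’s -Os flag optimizes for size by enabling the -O2 optimizations except those that often increase code size, usually at some cost in speed. In Zephyr, size optimization is normally selected through the configuration system (e.g., CONFIG_SIZE_OPTIMIZATIONS=y) rather than by adding the flag by hand.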

By following these steps, developers can significantly reduce the code bloat caused by the CMSIS DSP library in nRF9160 projects. The result is a binary that contains only the DSP code the application actually uses, leaving more flash for the rest of the firmware.
