-
Notifications
You must be signed in to change notification settings - Fork 538
Update CMake build and dlib version #221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Update CMake build and dlib version #221
Conversation
…name clashes when using later versions of C++ standard
* Add MITIE_API prefixes to C++ functions and classes to ensure symbols are exported to shared libraries * Rename MITIE_EXPORT->MITIE_API in mitie.h header * Convert const variables in conll_parser and ner_feature_extraction headers to constexpr
* Create CMakeLists for overarching project root folder and modularise build options (aiming to set defaults which align to original behaviour) * Update mitielib CMakeLists to assume it is called from parent project CMakeLists and implement new build options * Add ability to prepend mitie- prefix to tools and examples executable output files
… headers, C/C++ based tools and examples
…option to switch _example suffix on/off, error if system-installed dlib not found when MITIE_INTEGRATED_DLIB=OFF, omit lib prefix on Windows MinGW
ff47ba2
to
24f860c
Compare
Thanks for the PR, but I don't want to change a bunch of stuff that isn't causing users problems. Especially build system scripts. Those have worked for years across many platforms. Updates to stuff like that tends to break some subset of users. The same goes for updating dlib. MITIE doesn't need the updated dlib so updating it won't make MITIE better but might break some users using older compilers. And the PR is way too big :) |
Add directives for mitie.pc, mitie-config.cmake and mitie-version.cmake metadata generation and install rules for package discovery
24f860c
to
13ed622
Compare
Thanks for taking the time to review the pull request. I've removed the dlib version update to reduce noise. I've been compiling MITIE in an MSYS2 Windows environment while packaging and installing it as a shared library (alongside the tools and examples). Part of the changes are fixes to errors I was experiencing, such as linking errors with C++ tools and examples due to missing symbol exports. To resolve this, I introduced an MITIE_API macro and applied it to the relevant C++ classes. Example errors:
Other changes are primarily optional modifications that aim to preserve the current behaviour by default. I can separate these changes into smaller pull requests if that would be preferable. Could you please advise on which modifications you consider most viable for merging? For example, I assume fixing the C++ shared library prefixes is viable since current code breaks certain compilation scenarios (like on my MSYS2 Windows setup), however what about:
Additionally, it might be beneficial to establish some guidelines on the minimum supported tooling and language versions for future development. For example, could we consider requiring C++11 (or newer, if not already the minimum) and setting a minimum CMake version of 3.10 or 3.12? This would help eliminate the deprecation warnings we currently see with CMake 2.4/2.6, while ensuring that those who depend on older configurations can still use the existing stable releases. |
Well, why does the C++ stuff need to be exposed? MITIE is designed specifically to hide those C++ symbols and expose only a C API, which is then used via python or whatever when built as a shared library. Making shared libraries that expose C++ symbols is extremely complicated and basically doesn't work with normal C++ code on some platforms (e.g. mainly Windows, where just adding export statements to the classes is not going to work, the whole C++ API needs to be designed around the goal of making it work, which it isn't here.). Related to that, why are you trying to build C++ code with the MITIE code compiled into a shared library rather than building them the recommended way? |
I'm packaging MITIE for MSYS2 and wanting to use it in that environment, where libraries are generally compiled into DLLs (e.g. MSYS2's dlib is shared). It would also be more consistent with behaviour on Linux platforms where the symbols are exported fully by default in the current configuration. Additionally it also keeps the package with tools and examples smaller, but if there are strong reasons to keep the library static then I suppose this would probably be acceptable. I do note that the recommendation to build as static when using the C++ API is not clearly documented in the readme file and mitielib's CMakeLists.txt is configured to build as shared by default. I am aware that on Windows C++ DLLs generally need to be recompiled for each compiler, and there are potentially name mangling and ABI incompatibilities. However, MITIE is open source and MSYS2 is a relatively self-contained ecosystem, so this doesn't seem like a huge problem. Interested to know your views on whether there are deeper design constraints beyond this that need to be considered that might make a shared library fundamentally unsafe or unmanageable? Would an opt-in approach through a CMake option Most of the other suggestions could be deferred to the packaging process, but the fix for the C++ shared API would be beneficial to incorporate into the master repo (I'm not a fan of enabling an option to export all symbols like on Linux). Additionally, the option to build against a system-installed dlib would ideally be included as well. |
The problem though is on windows, dlls (ones made with visual studio at least, I'm not sure what MSYS2 ends up doing) have these memory allocation boundaries. So for instance, you can't have a function in a dll that's does this:
since what happens is someone calls Linux systems aren't setup this way, so shared libraries there are easy and work fine. But on windows, C++ shared libraries are hugely problematic because of this. You have to somehow construct the API such that client code never itself deletes memory that was allocated in your dll. That's the issue and it requires a carefully designed API to avoid that. Which hardly anyone does. Dlib certainly doesn't do that. The main MITIE API though, the C API, is explicitly designed for this so that you can use it as a dll. It's how the python tooling uses MITIE for instance. So that works well. But trying to get the C++ code in MITIE To work inside a dll is inevitably going to be a disaster because of this "separate memory allocation regions" thing windows dlls do. It's also why symbol export is off by default in most dll creation tools. Since it is so wildly dangerous and buggy to try and use C++ symbols across dll boundaries, specifically due to this issue. |
Windows DLLs are such a disaster. I was looking for some more reference on this, there used to be a good article that explained it but I'm not finding it. But for instance, https://devblogs.microsoft.com/oldnewthing/20060915-04/?p=29723 https://devblogs.microsoft.com/oldnewthing/20060915-04/?p=29723, https://docs.omniverse.nvidia.com/kit/docs/carbonite/latest/docs/memory/Memory.html etc. You can find a lot written about it. The TLDR is that any C++ code that is going to be the API for a DLL needs to be very carefully constructed around this limitation of DLLs. Normal C++ code can't be used. |
This pull request modernises MITIE's build process by improving shared library compilation, adding pkg-config and CMake package support, and modularising CMake configuration with new optionality, making packaging and distribution more flexible while preserving original behaviour by default.
mitielib : fix shared library compilation for C++ API
CMake build process uplift
dlib