2.4.6 Limitations of the interface

2.4.6.1 Strings

SWI-Prolog string handling has evolved over time. The functions that create atoms or strings from char* or wchar_t* are "old school", as are the functions that return a string as char* or wchar_t*. The PL_get_unify_put_[nw]chars() family is friendlier when it comes to different input, output, encoding and exception handling.

Roughly, the modern API is PL_get_nchars(), PL_unify_chars() and PL_put_chars() on terms. For atoms, only half of the API exists: PL_new_atom_mbchars() and PL_atom_mbchars(), which take an encoding, length and char*.

However, there is no native "string" type in C++; char* strings can be implicitly converted to std::string. If a C++ interface provides only std::string arguments or return values, that can introduce some inefficiency; therefore, many of the functions and constructors accept either a char* or a std::string as a value (also wchar_t* or std::wstring).
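The overloading idea described above can be sketched roughly as follows. This is a minimal illustration using only the standard library: c_api_count_bytes() is a hypothetical stand-in for an underlying C function that takes a length and a char*, and count_bytes() is a hypothetical wrapper; neither name belongs to the actual interface.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API function that takes a length and a char*.
// It only reports the length it was given.
std::size_t c_api_count_bytes(std::size_t len, const char* s) {
  (void)s;  // the bytes themselves are not inspected in this sketch
  return len;
}

// Overload for C strings: forwards the pointer directly, so no temporary
// std::string is constructed at the call site.
std::size_t count_bytes(const char* s) {
  return c_api_count_bytes(std::strlen(s), s);
}

// Overload for std::string: uses the stored length, so embedded NUL
// characters are preserved.
std::size_t count_bytes(const std::string& s) {
  return c_api_count_bytes(s.size(), s.data());
}
```

A call such as count_bytes("abc") selects the char* overload, because the array-to-pointer conversion is preferred over the user-defined conversion to std::string.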

For return values, char* is dangerous because it can point to local or stack memory. For this reason, wherever possible, the C++ API returns a std::string, which contains a copy of the string. This can be slightly less efficient than returning a char*, but it avoids some subtle and pervasive bugs that even address sanitizers can't detect. (If we wish to minimize the overhead of passing strings, this can be done by passing in a pointer to a string rather than returning a string value; but this is more cumbersome, and modern compilers can often optimize the code to avoid copying the return value.)
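The two styles mentioned here (returning a std::string by value, or filling in a string supplied by the caller) might look like the following sketch. Both function names are hypothetical and exist only for illustration; they are not part of the C++ API.

```cpp
#include <string>

// Hypothetical helper: returns the result by value.
// Returning buf.c_str() as a const char* instead would leave the caller
// with a dangling pointer, because buf is destroyed when the function returns.
std::string greeting_by_value(const std::string& name) {
  std::string buf = "hello, " + name;
  return buf;  // the caller owns the returned string
}

// Hypothetical helper: out-parameter variant. The caller supplies the
// storage, which avoids a returned temporary but is clumsier to use.
void greeting_into(const std::string& name, std::string* out) {
  out->assign("hello, ");
  out->append(name);
}
```

The by-value form lets the result be used directly in an expression, whereas the out-parameter form requires the caller to declare a string first.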

Many of the classes have an as_string() method; this might be changed in future to to_string(), to be consistent with std::to_string(). However, method names such as as_int32_t() were chosen instead of to_int32_t() because they imply that the representation is already an int32_t, and not that the value is converted to an int32_t. That is, if the value is a float, as_int32_t() will fail with an error rather than (for example) truncating the floating point value to fit into a 32-bit integer.
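The "as" naming convention can be illustrated with a small stand-alone sketch. The Value class below is hypothetical and far simpler than the real term classes; it only shows the intended behaviour, namely that as_int32_t() reports an error for a floating point value instead of truncating it. The choice of std::invalid_argument is an assumption made for this example.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical value type for illustration only; not a class from the API.
class Value {
public:
  explicit Value(std::int32_t i) : is_int_(true), i_(i), d_(0.0) {}
  explicit Value(double d) : is_int_(false), i_(0), d_(d) {}

  // "as_" semantics: succeeds only if the stored value already is an
  // int32_t; a double is never silently truncated.
  std::int32_t as_int32_t() const {
    if (!is_int_) {
      throw std::invalid_argument("value is not an int32_t");
    }
    return i_;
  }

private:
  bool is_int_;      // true if i_ holds the value, false if d_ does
  std::int32_t i_;
  double d_;
};
```

With this sketch, Value(std::int32_t{42}).as_int32_t() yields 42, while Value(3.7).as_int32_t() throws rather than returning 3.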