Rizin
unix-like reverse engineering framework and cli tools
tuklib_mbstr.h File Reference

Utility functions for handling multibyte strings.

#include "tuklib_common.h"

Macros

#define tuklib_mbstr_width   TUKLIB_SYMBOL(tuklib_mbstr_width)
 
#define tuklib_mbstr_fw   TUKLIB_SYMBOL(tuklib_mbstr_fw)
 

Functions

size_t tuklib_mbstr_width (const char *str, size_t *bytes)
 Get the number of columns needed for the multibyte string.
 
int tuklib_mbstr_fw (const char *str, int columns_min)
 Get the field width for printf() e.g. to align table columns.
 

Detailed Description

Utility functions for handling multibyte strings.

If not enough multibyte string support is available in the C library, these functions keep working with the assumption that all strings are in a single-byte character set without combining characters, e.g. US-ASCII or ISO-8859-*.

Definition in file tuklib_mbstr.h.

Macro Definition Documentation

◆ tuklib_mbstr_fw

#define tuklib_mbstr_fw   TUKLIB_SYMBOL(tuklib_mbstr_fw)

Definition at line 45 of file tuklib_mbstr.h.

◆ tuklib_mbstr_width

#define tuklib_mbstr_width   TUKLIB_SYMBOL(tuklib_mbstr_width)

Definition at line 24 of file tuklib_mbstr.h.

Function Documentation

◆ tuklib_mbstr_fw()

int tuklib_mbstr_fw ( const char *  str,
int  columns_min 
)

Get the field width for printf() e.g. to align table columns.

Printing simple tables to a terminal can be done using the field width feature in the printf() format string, but it works only with single-byte character sets, because printf() counts bytes rather than display columns. To do the same with multibyte strings, tuklib_mbstr_fw() can be used to calculate the appropriate field width.

The behavior of this function is undefined if

  • str is NULL or not terminated with '\0';
  • columns_min <= 0; or
  • the calculated field width exceeds INT_MAX.
Returns
If tuklib_mbstr_width(str, NULL) fails, -1 is returned. If str needs more columns than columns_min, zero is returned. Otherwise a positive integer is returned, which can be used as the field width, e.g. printf("%*s", fw, str).
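
The return-value rules above can be illustrated with a minimal sketch. The helpers sketch_field_width() and sketch_format_cell() are hypothetical and not part of tuklib; the display width is passed in explicitly (instead of calling tuklib_mbstr_width()) so that the example does not depend on the current locale. It assumes the UTF-8 encoding of "ä" (two bytes, one column) purely as sample input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of tuklib): the same arithmetic as
 * tuklib_mbstr_fw(), with the display width supplied by the caller. */
static int sketch_field_width(const char *str, size_t width, int columns_min)
{
	/* The documented preconditions are checked here instead of
	 * being left undefined. */
	assert(str != NULL && columns_min > 0);

	if (width == (size_t)-1)
		return -1; /* the width calculation failed */

	if (width > (size_t)columns_min)
		return 0; /* the string is wider than the column */

	/* printf() counts bytes, so start from strlen() and add one
	 * padding byte for every display column that is still missing. */
	return (int)(strlen(str) + ((size_t)columns_min - width));
}

/* Hypothetical helper: format one left-aligned table cell followed
 * by a '|' separator. Returns the number of bytes written (as
 * snprintf() does) or -1 if the width is unknown. */
static int sketch_format_cell(char *buf, size_t size, const char *str,
		size_t width, int columns_min)
{
	const int fw = sketch_field_width(str, width, columns_min);
	if (fw < 0)
		return -1;

	/* A field width of zero prints the string without padding. */
	return snprintf(buf, size, "%-*s|", fw, str);
}
```

For a string that occupies one column but two bytes, a five-column cell needs a field width of six bytes: the two bytes of the string plus four padding spaces.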

Definition at line 17 of file tuklib_mbstr_fw.c.

int tuklib_mbstr_fw(const char *str, int columns_min)
{
	size_t len;
	const size_t width = tuklib_mbstr_width(str, &len);
	if (width == (size_t)-1)
		return -1;

	if (width > (size_t)columns_min)
		return 0;

	if (width < (size_t)columns_min)
		len += (size_t)columns_min - width;

	return len;
}

References len, cmd_descs_generate::str, tuklib_mbstr_width, and width.

◆ tuklib_mbstr_width()

size_t tuklib_mbstr_width ( const char *  str,
size_t *  bytes 
)

Get the number of columns needed for the multibyte string.

This is somewhat similar to wcswidth() but works on multibyte strings.

Parameters
str     String whose width is to be calculated. If the current locale uses a multibyte character set that has shift states, the string must begin and end in the initial shift state.
bytes   If this is not NULL, *bytes is set to the value returned by strlen(str) (even if an error occurs when calculating the width).
Returns
On success, the number of columns needed to display the string, e.g. in a terminal emulator, is returned. On error, (size_t)-1 is returned. Possible errors include an invalid, partial, or non-printable multibyte character in str, or str not ending in the initial shift state.
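
The error cases above can be told apart using the standard return values of mbrtowc(): (size_t)-1 signals an invalid sequence and (size_t)-2 an incomplete one. The following is a minimal sketch of that distinction, not tuklib's implementation; sketch_mb_check() and the sketch_mb_status enum are hypothetical names. It leaves out the wcwidth() call, so non-printable characters are not detected here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical status codes, for illustration only. */
enum sketch_mb_status {
	SKETCH_MB_OK,
	SKETCH_MB_INVALID,     /* invalid multibyte sequence */
	SKETCH_MB_PARTIAL,     /* string ends in the middle of a character */
	SKETCH_MB_SHIFT_STATE  /* string does not end in the initial shift state */
};

/* Hypothetical helper: walk the string one multibyte character at a
 * time in the current locale and report why decoding fails, if it
 * does. On success the character count is stored in *chars when
 * chars is not NULL. */
static enum sketch_mb_status sketch_mb_check(const char *str, size_t *chars)
{
	assert(str != NULL);

	const size_t len = strlen(str);
	mbstate_t state;
	memset(&state, 0, sizeof(state));

	size_t count = 0;
	size_t i = 0;

	while (i < len) {
		wchar_t wc;
		const size_t ret = mbrtowc(&wc, str + i, len - i, &state);

		if (ret == (size_t)-2)
			return SKETCH_MB_PARTIAL;

		/* Zero would mean a null character before the end of the
		 * string; treat it like an invalid sequence. */
		if (ret == (size_t)-1 || ret == 0)
			return SKETCH_MB_INVALID;

		i += ret;
		++count;
	}

	if (!mbsinit(&state))
		return SKETCH_MB_SHIFT_STATE;

	if (chars != NULL)
		*chars = count;

	return SKETCH_MB_OK;
}
```

Since (size_t)-1 and (size_t)-2 are both larger than any valid string length, a single range check on the return value of mbrtowc() is enough when the caller does not need to know which error occurred.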

Definition at line 22 of file tuklib_mbstr_width.c.

size_t tuklib_mbstr_width(const char *str, size_t *bytes)
{
	const size_t len = strlen(str);
	if (bytes != NULL)
		*bytes = len;

#if !(defined(HAVE_MBRTOWC) && defined(HAVE_WCWIDTH))
	// In single-byte mode, the width of the string is the same
	// as its length.
	return len;

#else
	mbstate_t state;
	memset(&state, 0, sizeof(state));

	size_t width = 0;
	size_t i = 0;

	// Convert one multibyte character at a time to wchar_t
	// and get its width using wcwidth().
	while (i < len) {
		wchar_t wc;
		const size_t ret = mbrtowc(&wc, str + i, len - i, &state);
		if (ret < 1 || ret > len)
			return (size_t)-1;

		i += ret;

		const int wc_width = wcwidth(wc);
		if (wc_width < 0)
			return (size_t)-1;

		width += (size_t)wc_width;
	}

	// Require that the string ends in the initial shift state.
	// This way the caller can combine the string with other
	// strings without needing to worry about the shift states.
	if (!mbsinit(&state))
		return (size_t)-1;

	return width;
#endif
}

References bytes, i, len, memset(), NULL, cmd_descs_generate::str, and width.