| <?xml version="1.0"?> |
| <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN" |
| "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [ |
| <!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2003/XInclude'"> |
| <!ENTITY version SYSTEM "version.xml"> |
| ]> |
| <chapter id="shaping-and-shape-plans"> |
| <title>Shaping and shape plans</title> |
| <para> |
| Once you have your face and font objects configured as desired and |
| your input buffer is filled with the characters you need to shape, |
| all you need to do is call <function>hb_shape()</function>. |
| </para> |
| <para> |
| HarfBuzz will return the shaped version of the text in the same |
| buffer that you provided, but it will be in output mode. At that |
| point, you can iterate through the glyphs in the buffer, drawing |
| each one at the specified position or handing them off to the |
| appropriate graphics library. |
| </para> |
| <para> |
| For the most part, HarfBuzz's shaping step is straightforward from |
| the outside. But that doesn't mean there will never be cases where |
| you want to look under the hood and see what is happening on the |
| inside. HarfBuzz provides facilities for doing that, too. |
| </para> |
| |
| <section id="shaping-buffer-output"> |
| <title>Shaping and buffer output</title> |
| <para> |
| The <function>hb_shape()</function> function call takes four arguments: the font |
| object to use, the buffer of characters to shape, an array of |
| user-specified features to apply, and the length of that feature |
| array. The feature array can be NULL, so for the sake of |
| simplicity we will start with that case. |
| </para> |
| <para> |
| Internally, HarfBuzz looks at the tables of the font file to |
| determine where glyph classes, substitutions, and positioning |
| are defined, using that information to decide which |
| <emphasis>shaper</emphasis> to use (<literal>ot</literal> for |
| OpenType fonts, <literal>aat</literal> for Apple Advanced |
| Typography fonts, and so on). It also looks at the direction, |
| script, and language properties of the segment to figure out |
| which script-specific shaping model is needed (at least, in |
| shapers that support multiple options). |
| </para> |
| <para> |
| If a font has a GDEF table, then that is used for |
| glyph classes; if not, HarfBuzz will fall back to Unicode |
| categorization by code point. If a font has an AAT <literal>morx</literal> table, |
| then it is used for substitutions; if not, but there is a GSUB |
| table, then the GSUB table is used. If the font has an AAT |
| <literal>kerx</literal> table, then it is used for positioning; if not, but |
| there is a GPOS table, then the GPOS table is used. If neither |
| table is found, but there is a <literal>kern</literal> table, then HarfBuzz will |
| use the <literal>kern</literal> table. If there is no <literal>kerx</literal>, no GPOS, and no |
| <literal>kern</literal>, HarfBuzz will fall back to positioning marks itself. |
| </para> |
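| <para> |
| The fallback order described above can be summarized in a short |
| sketch. This is not HarfBuzz code: the <type>font_tables_t</type> |
| struct, the enums, and the <function>pick_*</function> functions |
| are hypothetical names invented for illustration, and the real |
| selection logic has additional special cases. |
| </para>

```c
#include <stdbool.h>

/* Hypothetical summary of which tables a font file provides. */
typedef struct {
    bool has_gdef, has_morx, has_gsub, has_kerx, has_gpos, has_kern;
} font_tables_t;

typedef enum { CLASSES_GDEF, CLASSES_UNICODE } class_source_t;
typedef enum { SUBST_MORX, SUBST_GSUB, SUBST_NONE } subst_source_t;
typedef enum { POS_KERX, POS_GPOS, POS_KERN, POS_FALLBACK_MARKS } pos_source_t;

/* Glyph classes: GDEF if present, otherwise Unicode categorization. */
class_source_t pick_classes(const font_tables_t *t)
{
    return t->has_gdef ? CLASSES_GDEF : CLASSES_UNICODE;
}

/* Substitutions: morx takes precedence over GSUB. */
subst_source_t pick_substitution(const font_tables_t *t)
{
    if (t->has_morx) return SUBST_MORX;
    if (t->has_gsub) return SUBST_GSUB;
    return SUBST_NONE;
}

/* Positioning: kerx, then GPOS, then kern, then fallback mark positioning. */
pos_source_t pick_positioning(const font_tables_t *t)
{
    if (t->has_kerx) return POS_KERX;
    if (t->has_gpos) return POS_GPOS;
    if (t->has_kern) return POS_KERN;
    return POS_FALLBACK_MARKS;
}
```

| <para> |
| For a font with GDEF, GSUB, and GPOS only, the three functions |
| select GDEF, GSUB, and GPOS respectively; a font with nothing but |
| a <literal>kern</literal> table ends up with Unicode-based glyph |
| classes, no substitutions, and <literal>kern</literal> positioning. |
| </para>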
| <para> |
| With a well-behaved OpenType font, you expect GDEF, GSUB, and |
| GPOS tables to all be applied. HarfBuzz implements the |
| script-specific shaping models in internal functions, rather |
| than in the public API. |
| </para> |
| <para> |
| The algorithms |
| used for complex scripts can be quite involved; HarfBuzz tries |
| to be compatible with the OpenType Layout specification |
| and, wherever there is any ambiguity, HarfBuzz attempts to replicate the |
| output of Microsoft's Uniscribe engine. See the <ulink |
| url="https://docs.microsoft.com/en-us/typography/script-development/standard">Microsoft |
| Typography pages</ulink> for more detail. |
| </para> |
| <para> |
| In general, though, all that you need to know is that |
| <function>hb_shape()</function> returns the results of shaping |
| in the same buffer that you provided. The buffer's content type |
| will now be set to |
| <literal>HB_BUFFER_CONTENT_TYPE_GLYPHS</literal>, indicating |
| that it contains shaped output, rather than input text. You can |
| now extract the glyph information and positioning arrays: |
| </para> |
| <programlisting language="C"> |
| hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &glyph_count); |
| hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &glyph_count); |
| </programlisting> |
| <para> |
| The glyph information array holds a <type>hb_glyph_info_t</type> |
| for each output glyph. Client code normally reads two of its fields: |
| <parameter>codepoint</parameter> and |
| <parameter>cluster</parameter>. Whereas, in the input buffer, |
| the <parameter>codepoint</parameter> field contained the Unicode |
| code point, it now contains the glyph ID of the corresponding |
| glyph in the font. The <parameter>cluster</parameter> field is |
| an integer that you can use to help identify when shaping has |
| reordered, split, or combined code points; we will say more |
| about that in the next chapter. |
| </para> |
| <para> |
| The glyph positions array holds a corresponding |
| <type>hb_glyph_position_t</type> for each output glyph, |
| containing four fields: <parameter>x_advance</parameter>, |
| <parameter>y_advance</parameter>, |
| <parameter>x_offset</parameter>, and |
| <parameter>y_offset</parameter>. The advances tell you how far |
| you need to move the drawing point after drawing this glyph, |
| depending on whether you are setting horizontal text (in which |
| case you will have x advances) or vertical text (for which you |
| will have y advances). The x and y offsets tell you where to |
| move to start drawing the glyph; usually you will have both an |
| x and a y offset, regardless of the text direction. |
| </para> |
| <para> |
| Most of the time, you will rely on a font-rendering library or |
| other graphics library to do the actual drawing of glyphs, so |
| you will need to iterate through the glyphs in the buffer and |
| pass the corresponding values off. |
| </para> |
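| <para> |
| A minimal sketch of that iteration follows. To keep it |
| self-contained, it uses a hypothetical <type>glyph_pos_t</type> |
| struct that mirrors the four fields of |
| <type>hb_glyph_position_t</type> described above; the |
| <type>point_t</type> type and the <function>layout_run()</function> |
| function are likewise invented for illustration. Each glyph is |
| drawn at the current pen position plus its offsets, and the pen |
| then moves by the glyph's advances. |
| </para>

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the four hb_glyph_position_t fields discussed above. */
typedef struct {
    int32_t x_advance, y_advance, x_offset, y_offset;
} glyph_pos_t;

typedef struct { int32_t x, y; } point_t;

/* Computes where each glyph should be drawn, starting from `pen`.
 * Writes one origin per glyph into origins[] and returns the pen
 * position after the last glyph. */
point_t layout_run(const glyph_pos_t *pos, size_t count,
                   point_t pen, point_t *origins)
{
    for (size_t i = 0; i < count; i++) {
        /* Offsets shift this glyph only; they do not move the pen. */
        origins[i].x = pen.x + pos[i].x_offset;
        origins[i].y = pen.y + pos[i].y_offset;

        /* Advances move the pen for the glyphs that follow. */
        pen.x += pos[i].x_advance;
        pen.y += pos[i].y_advance;
    }
    return pen;
}
```

| <para> |
| In a real program, the computed origins and the glyph IDs from the |
| glyph information array would be handed to the rendering library. |
| </para>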
| </section> |
| |
| <section id="shaping-opentype-features"> |
| <title>OpenType features</title> |
| <para> |
| OpenType features enable fonts to include smart behavior, |
| implemented as "lookup" rules stored in the GSUB and GPOS |
| tables. The OpenType specification defines a long list of |
| standard features that fonts can use for these behaviors; each |
| feature has a four-character reserved name and a well-defined |
| semantic meaning. |
| </para> |
| <para> |
| Some OpenType features are defined for the purpose of supporting |
| complex-script shaping, and are automatically activated, but |
| only when a buffer's script property is set to a script that the |
| feature supports. |
| </para> |
| <para> |
| Other features are more generic and can apply to several (or |
| any) scripts, and shaping engines are expected to implement |
| them. By default, HarfBuzz activates several of these features |
| on every text run. They include <literal>abvm</literal>, |
| <literal>blwm</literal>, <literal>ccmp</literal>, |
| <literal>locl</literal>, <literal>mark</literal>, |
| <literal>mkmk</literal>, and <literal>rlig</literal>. |
| </para> |
| <para> |
| In addition, if the text direction is horizontal, HarfBuzz |
| also applies the <literal>calt</literal>, |
| <literal>clig</literal>, <literal>curs</literal>, |
| <literal>dist</literal>, <literal>kern</literal>, |
| <literal>liga</literal>, and <literal>rclt</literal> features. |
| </para> |
| <para> |
| Additionally, when HarfBuzz encounters a fraction slash |
| (<literal>U+2044</literal>), it looks backward and forward for decimal |
| digits (Unicode General Category = Nd), and enables features |
| <literal>numr</literal> on the sequence before the fraction slash, |
| <literal>dnom</literal> on the sequence after the fraction slash, |
| and <literal>frac</literal> on the whole sequence including the fraction |
| slash. |
| </para> |
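| <para> |
| The scan around the fraction slash can be sketched as follows. |
| This is an illustration, not HarfBuzz's implementation: the |
| <type>fraction_span_t</type> struct and the |
| <function>find_fraction()</function> function are hypothetical, |
| and the digit test is simplified to ASCII digits, whereas the |
| General Category Nd covers decimal digits in many scripts. |
| </para>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open ranges around the first fraction slash in a run:
 *   numr applies to [numr_start, slash)
 *   dnom applies to [slash + 1, dnom_end)
 *   frac applies to [numr_start, dnom_end) */
typedef struct {
    size_t numr_start, slash, dnom_end;
    bool found;
} fraction_span_t;

/* Simplification: ASCII digits only, standing in for General Category Nd. */
static bool is_decimal_digit(uint32_t cp)
{
    return cp >= '0' && cp <= '9';
}

fraction_span_t find_fraction(const uint32_t *text, size_t len)
{
    fraction_span_t span = { 0, 0, 0, false };
    for (size_t i = 0; i < len; i++) {
        if (text[i] != 0x2044u) /* U+2044 FRACTION SLASH */
            continue;

        /* Look backward for the digits of the numerator. */
        size_t start = i;
        while (start > 0 && is_decimal_digit(text[start - 1]))
            start--;

        /* Look forward for the digits of the denominator. */
        size_t end = i + 1;
        while (end < len && is_decimal_digit(text[end]))
            end++;

        span.numr_start = start;
        span.slash = i;
        span.dnom_end = end;
        span.found = true;
        return span;
    }
    return span;
}
```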
| <para> |
| Some script-specific shaping models |
| (see <xref linkend="opentype-shaping-models" />) disable some of the |
| features listed above: |
| </para> |
| <itemizedlist> |
| <listitem> |
| <para> |
| Hangul: <literal>calt</literal> |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Indic: <literal>liga</literal> |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Khmer: <literal>liga</literal> |
| </para> |
| </listitem> |
| </itemizedlist> |
| <para> |
| If the text direction is vertical, HarfBuzz applies |
| the <literal>vert</literal> feature by default. |
| </para> |
| <para> |
| Still other features are designed to be purely optional and left |
| up to the application or the end user to enable or disable as desired. |
| </para> |
| <para> |
| You can adjust the set of features that HarfBuzz applies to a |
| buffer by supplying an array of <type>hb_feature_t</type> |
| features as the third argument to |
| <function>hb_shape()</function>. For a simple case, let's just |
| enable the <literal>dlig</literal> feature, which turns on any |
| "discretionary" ligatures in the font: |
| </para> |
| <programlisting language="C"> |
| hb_feature_t userfeatures[1]; |
| userfeatures[0].tag = HB_TAG('d','l','i','g'); |
| userfeatures[0].value = 1; |
| userfeatures[0].start = HB_FEATURE_GLOBAL_START; |
| userfeatures[0].end = HB_FEATURE_GLOBAL_END; |
| </programlisting> |
| <para> |
| <literal>HB_FEATURE_GLOBAL_START</literal> and |
| <literal>HB_FEATURE_GLOBAL_END</literal> are macros we can use |
| to indicate that the features will be applied to the entire |
| buffer. We could also have used a literal <literal>0</literal> |
| for the start and a <literal>-1</literal> to indicate the end of |
| the buffer (or have selected other start and end positions, if needed). |
| </para> |
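| <para> |
| To make the tag and range fields more concrete, here is a |
| self-contained sketch that does not depend on the HarfBuzz |
| headers. The <function>make_tag()</function> function packs four |
| characters into a 32-bit integer in the same byte order that |
| <literal>HB_TAG</literal> uses, and |
| <function>feature_covers()</function> treats the start of a range |
| as inclusive and the end as exclusive. The |
| <type>feature_t</type> struct and both function names are |
| hypothetical stand-ins, not part of the HarfBuzz API. |
| </para>

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for HB_FEATURE_GLOBAL_START and HB_FEATURE_GLOBAL_END. */
#define GLOBAL_START 0u
#define GLOBAL_END   ((unsigned int) -1)

/* Stand-in for the hb_feature_t fields used in the text above. */
typedef struct {
    uint32_t tag;
    uint32_t value;
    unsigned int start, end;
} feature_t;

/* Packs four characters into one integer, first character in the
 * most significant byte. */
uint32_t make_tag(char a, char b, char c, char d)
{
    return ((uint32_t)(uint8_t) a << 24) |
           ((uint32_t)(uint8_t) b << 16) |
           ((uint32_t)(uint8_t) c << 8)  |
            (uint32_t)(uint8_t) d;
}

/* True if the feature's range [start, end) includes the given position. */
bool feature_covers(const feature_t *f, unsigned int position)
{
    return f->start <= position && position < f->end;
}
```

| <para> |
| Because <literal>-1</literal> converts to the largest unsigned |
| value, a feature with a global end covers every position that can |
| occur in a buffer. |
| </para>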
| <para> |
| When we pass the <varname>userfeatures</varname> array to |
| <function>hb_shape()</function>, any discretionary ligature |
| substitutions from our font that match the text in our buffer |
| will get performed: |
| </para> |
| <programlisting language="C"> |
| unsigned int num_features = 1; /* number of entries in userfeatures */ |
| hb_shape(font, buf, userfeatures, num_features); |
| </programlisting> |
| <para> |
| Just like we enabled the <literal>dlig</literal> feature by |
| setting its <parameter>value</parameter> to |
| <literal>1</literal>, you would disable a feature by setting its |
| <parameter>value</parameter> to <literal>0</literal>. Some |
| features can take other <parameter>value</parameter> settings; |
| be sure you read the full specification of each feature tag to |
| understand what it does and how to control it. |
| </para> |
| </section> |
| |
| <section id="shaping-shaper-selection"> |
| <title>Shaper selection</title> |
| <para> |
| The basic version of <function>hb_shape()</function> determines |
| its shaping strategy based on examining the capabilities of the |
| font file. OpenType font tables cause HarfBuzz to try the |
| <literal>ot</literal> shaper, while AAT font tables cause HarfBuzz to try the |
| <literal>aat</literal> shaper. |
| </para> |
| <para> |
| In the real world, however, a font might include some unusual |
| mix of tables, or one of the tables might simply be broken for |
| the script you need to shape. So, sometimes, you might not |
| want to rely on HarfBuzz's process for deciding what to do, and |
| would rather tell HarfBuzz yourself which shapers to try. |
| </para> |
| <para> |
| <function>hb_shape_full()</function> is an alternate shaping |
| function that lets you supply a list of shapers for HarfBuzz to |
| try, in order, when shaping your buffer. For example, if you |
| have determined that HarfBuzz's attempts to work around broken |
| tables give you better results than the AAT shaper itself does, |
| you might move the AAT shaper to the end of your list of |
| preferences and call <function>hb_shape_full()</function> |
| </para> |
| <programlisting language="C"> |
| const char *shaperprefs[] = {"ot", "default", "aat", NULL}; /* the list must be NULL-terminated */ |
| ... |
| hb_shape_full(font, buf, userfeatures, num_features, shaperprefs); |
| </programlisting> |
| <para> |
| to get results you are happier with. |
| </para> |
| <para> |
| You may also want to call |
| <function>hb_shape_list_shapers()</function> to get a list of |
| the shapers that were built at compile time in your copy of HarfBuzz. |
| </para> |
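| <para> |
| <function>hb_shape_list_shapers()</function> returns a |
| <literal>NULL</literal>-terminated array of strings, the same |
| layout that <function>hb_shape_full()</function> expects for its |
| shaper list. The sketch below shows one way to walk such a list, |
| for example to check that a preferred shaper is available before |
| asking for it. The <function>find_shaper()</function> helper is a |
| hypothetical name, not a HarfBuzz function. |
| </para>

```c
#include <stddef.h>
#include <string.h>

/* Returns the index of `name` in a NULL-terminated list of shaper
 * names, or -1 if the name does not appear in the list. */
int find_shaper(const char *const *list, const char *name)
{
    for (int i = 0; list[i] != NULL; i++) {
        if (strcmp(list[i], name) == 0)
            return i;
    }
    return -1;
}
```

| <para> |
| With the real library, the first argument would be the array |
| returned by <function>hb_shape_list_shapers()</function>. |
| </para>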
| </section> |
| |
| <section id="shaping-plans-and-caching"> |
| <title>Plans and caching</title> |
| <para> |
| Internally, HarfBuzz uses a structure called a shape plan to |
| track its decisions about how to shape the contents of a |
| buffer. The <function>hb_shape()</function> function builds up the shape plan by |
| examining segment properties and by inspecting the contents of |
| the font. |
| </para> |
| <para> |
| This process can involve some decision-making and |
| trade-offs — for example, HarfBuzz inspects the GSUB and GPOS |
| lookups for the script and language tags set on the segment |
| properties, but it falls back on the lookups under the |
| <literal>DFLT</literal> tag (and sometimes other common tags) |
| if there are actually no lookups for the tag requested. |
| </para> |
| <para> |
| HarfBuzz also includes some work-arounds for |
| handling well-known older font conventions that do not follow |
| OpenType or Unicode specifications, for buggy system fonts, and for |
| peculiarities of Microsoft Uniscribe. All of that means that a |
| shape plan, while not something that you should edit directly in |
| client code, still might be an object that you want to |
| inspect. Furthermore, if resources are tight, you might want to |
| cache the shape plan that HarfBuzz builds for your buffer and |
| font, so that you do not have to rebuild it for every shaping call. |
| </para> |
| <para> |
| You can create a cacheable shape plan with |
| <function>hb_shape_plan_create_cached(face, props, |
| user_features, num_user_features, shaper_list)</function>, where |
| <parameter>face</parameter> is a face object (not a font object, |
| notably), <parameter>props</parameter> is an |
| <type>hb_segment_properties_t</type>, |
| <parameter>user_features</parameter> is an array of |
| <type>hb_feature_t</type>s (with length |
| <parameter>num_user_features</parameter>), and |
| <parameter>shaper_list</parameter> is a list of shapers to try. |
| </para> |
| <para> |
| Shape plans are objects in HarfBuzz, so there are |
| reference-counting functions and user-data attachment functions |
| you can |
| use. <function>hb_shape_plan_reference(shape_plan)</function> |
| increases the reference count on a shape plan, while |
| <function>hb_shape_plan_destroy(shape_plan)</function> decreases |
| the reference count, destroying the shape plan when the last |
| reference is dropped. |
| </para> |
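| <para> |
| The reference-counting behavior can be pictured with a toy |
| model. The <type>plan_t</type> struct and the two functions below |
| are hypothetical stand-ins for illustration only; the sketch |
| assumes that a newly created object starts with one reference, |
| and it records destruction in a flag instead of freeing memory. |
| </para>

```c
#include <stdbool.h>

/* Toy stand-in for a reference-counted object such as a shape plan. */
typedef struct {
    int ref_count;
    bool destroyed;
} plan_t;

/* Takes an additional reference and returns the same object. */
plan_t *plan_reference(plan_t *plan)
{
    plan->ref_count++;
    return plan;
}

/* Drops one reference; the object is destroyed only when the last
 * reference is gone. */
void plan_destroy(plan_t *plan)
{
    if (--plan->ref_count == 0)
        plan->destroyed = true;
}
```

| <para> |
| Every call that takes a reference therefore needs a matching |
| destroy call before the object actually goes away. |
| </para>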
| <para> |
| You can attach user data to a shape plan (with a key) using the |
| <function>hb_shape_plan_set_user_data(shape_plan, key, data, destroy, replace)</function> |
| function, optionally supplying a <parameter>destroy</parameter> |
| callback to use. You can then fetch the user data attached to a |
| shape plan with |
| <function>hb_shape_plan_get_user_data(shape_plan, key)</function>. |
| </para> |
| </section> |
| |
| </chapter> |