PDFlib
GmbH
München
,
Germany
www
.
pdflib
.
com
FontReporter
1
.
3
®
A
plugin
for
analyzing
fonts
in
PDF
Copyright
©
2005-2008
PDFlib
GmbH
.
All
rights
reserved
.
PDFlib
GmbH
Franziska-Bilek-Weg
9
,
80339
München
,
Germany
www
.
pdflib
.
com
This
publication
and
the
information
herein
is
furnished
as
is
,
is
subject
to
change
without
notice
,
and
should
not
be
construed
as
a
commitment
by
PDFlib
GmbH
.
PDFlib
GmbH
assumes
no
responsibility
or
liability
for
any
errors
or
inaccuracies
,
makes
no
warranty
of
any
kind
(
express
,
implied
or
statutory
)
with
respect
to
this
publication
,
and
expressly
disclaims
any
and
all
warranties
of
merchantability
,
fitness
for
particular
purposes
and
noninfringement
of
third
party
rights
.
Adobe
,
Acrobat
,
and
PostScript
are
trademarks
of
Adobe
Systems
Inc
.
AIX
,
IBM
,
OS
/
390
,
WebSphere
,
iSeries
,
and
zSeries
are
trademarks
of
International
Business
Machines
Corporation
.
ActiveX
,
Microsoft
,
Windows
,
and
Windows
NT
are
trademarks
of
Microsoft
Corporation
.
Apple
,
Macintosh
and
TrueType
are
trademarks
of
Apple
Computer
,
Inc
.
Unicode
and
the
Unicode
logo
are
trademarks
of
Unicode
,
Inc
.
Unix
is
a
trademark
of
The
Open
Group
.
Java
and
Solaris
are
trademarks
of
Sun
Microsystems
,
Inc
.
Other
company
product
and
service
names
may
be
trademarks
or
service
marks
of
others
.
Thank
you
for
using
PDFlib
FontReporter
,
a
free
Acrobat
plugin
provided
by
PDFlib
GmbH
.
PDFlib
GmbH
offers
software
for
creating
and
processing
PDF
documents
.
Please
visit
our
Web
site
to
learn
more
about
our
products
.
You
can
use
PDFlib
FontReporter
free
of
charge
;
however
,
it
is
not
in
the
public
domain
.
This
software
cannot
be
sold
or
redistributed
(
whether
for
a
fee
or
at
no
charge
),
either
stand-alone
or
in
combination
with
any
other
product
,
without
the
express
written
permission
of
PDFlib
GmbH
.
Although
PDFlib
FontReporter
is
not
a
commercial
product
,
we
strive
to
provide
high
quality
.
If
you
run
into
problems
you
are
encouraged
to
contact
us
at
support@pdflib
.
com
.
Contents
1 Installing PDFlib FontReporter
2 Working with FontReporter
2.1 What can you do with FontReporter?
2.2 Overview of PDF Font Formats
2.3 Contents of a Font Report
2.4 Investigate PDF Problems with FontReporter
2.5 Error Messages
A Revision History
1
Installing
PDFlib
FontReporter
Requirements
.
The
PDFlib
FontReporter
plugin
works
with
Acrobat
6
/
7
/
8
Standard
and
Professional
on
Windows
and
Mac
,
and
Acrobat
9
Standard
,
Pro
and
Pro
Extended
on
Windows
.
The
plugin
doesn
’
t
work
with
Acrobat
Elements
or
any
version
of
Acrobat
Reader
/
Adobe
Reader
.
Installing
FontReporter
on
Windows
.
All
plugin-related
files
must
be
copied
to
the
subdirectory
»
PDFlib
FontReporter
«
in
the
Acrobat
plugin
folder
.
This
is
done
automatically
by
the
plugin
installer
,
but
can
also
be
done
manually
.
A
typical
location
of
the
plugin
folder
is
as
follows
:
C:\Program Files\Adobe\Acrobat 9.0\Acrobat\plug_ins\PDFlib FontReporter
Installing
FontReporter
for
Acrobat
6
/
7
/
8
on
the
Mac
.
With
Acrobat
6
/
7
/
8
the
plugin
folder
is
not
visible
in
the
Finder
.
Make
sure
that
Acrobat
is
not
running
and
follow
these
steps
:
>
Extract
the
plugin
files
by
double-clicking
the
disk
image
(.
dmg
).
>
Locate
the
Acrobat
application
icon
in
the
Finder
.
It
is
usually
located
in
a
folder
which
has
a
name
similar
to
the
following
:
/Applications/Adobe Acrobat 8.0 Professional
>
Single-click
on
the
Acrobat
application
icon
and
select
File
,
Get
Info
.
>
In
the
window
that
pops
up
click
the
triangle
next
to
Plug-ins
.
>
Click
Add
...
and
select
the
FontReporter
folder
from
the
folder
which
has
been
created
in
the
first
step
.
Note
that
this
folder
will
not
immediately
show
up
in
the
list
of
plugins
,
but
only
when
you
open
the
info
window
next
time
.
Multi-lingual
Interface
.
FontReporter
supports
multiple
languages
in
the
user
interface
and
generated
font
reports
.
Depending
on
the
application
language
of
Acrobat
,
FontReporter
will
choose
its
interface
language
automatically
.
Currently
English
and
German
interfaces
are
available
.
If
Acrobat
runs
in
any
other
language
mode
,
FontReporter
will
use
the
English
interface
.
Troubleshooting
.
If
the
FontReporter
plugin
doesn
’
t
seem
to
work
,
make
sure
that
in
Edit
,
Preferences
,
[
General
...],
Startup
the
»
Use
only
certified
plug-ins
«
box
is
unchecked
.
2
Working
with
FontReporter
2
.
1
What
can
you
do
with
FontReporter
?
FontReporter
is
a
useful
tool
if
you
are
interested
in
fonts
within
PDF
documents
.
It
provides
font-
and
encoding-related
information
which
will
help
in
a
variety
of
situations
:
>
analyze
printing
problems
(
e
.
g
.
a
particular
font
causes
printing
errors
)
>
investigate
text
extraction
problems
(
e
.
g
.
copying
text
from
a
PDF
results
in
garbage
)
>
visualize
Unicode
mappings
for
a
font
>
find
flaws
in
the
PDF
creation
workflow
(
e
.
g
.
printer
driver
converted
a
PostScript
Type
1
font
to
Type
3
)
>
test
whether
ToUnicode
mapping
tables
(
required
for
PDF
/
A
compliance
)
are
present
>
identify
logos
and
symbols
which
are
represented
as
text
in
a
PDF
>
learn
which
fonts
are
contained
in
a
PDF
,
and
which
glyphs
they
contain
(
e
.
g
.
the
file
size
is
too
large
because
some
fonts
ended
up
in
the
PDF
unintentionally
)
>
check
font
subsets
to
see
which
glyphs
are
contained
in
the
subset
>
learn
more
about
PDF
font
technology
Using
FontReporter
is
as
easy
as
bringing
up
the
menu
Plug-Ins
,
PDFlib
FontReporter
...,
Create
Font
Report
in
Acrobat
.
This
will
create
a
font
report
for
all
pages
of
the
current
PDF
document
as
a
separate
PDF
.
Two
pages
from
typical
font
reports
are
shown
in
Figure
2
.
1
.
Fig
.
2
.
1
Sample
font
reports
Supported
PDF
and
font
formats
.
FontReporter
supports
all
PDF
versions
up
to
PDF
1
.
8
,
the
file
format
created
by
Acrobat
9
.
All
font
and
encoding
formats
available
in
PDF
are
supported
,
as
well
as
all
types
of
embedded
font
data
.
Advantages
over
Acrobat
’
s
font
properties
panel
.
All
versions
of
Acrobat
including
Adobe
Reader
provide
font
information
via
File
,
Document
Properties
...,
Fonts
.
However
,
Acrobat
’
s
font
overview
is
limited
in
use
;
FontReporter
provides
the
following
advantages
compared
to
Acrobat
’
s
font
list
:
>
FontReporter
provides
much
more
information
about
each
font
>
FontReporter
deals
with
CJK
font
names
even
on
Western
systems
>
FontReporter
provides
glyph
tables
containing
the
glyphs
of
a
font
along
with
their
widths
,
names
,
and
Unicode
values
>
FontReporter
presents
the
output
as
a
PDF
document
so
that
you
can
save
or
print
it
>
FontReporter
is
guaranteed
to
process
the
full
document
,
regardless
of
which
pages
have
already
been
displayed
in
Acrobat
PDF
text
extraction
with
PDFlib
TET
.
FontReporter
is
an
auxiliary
tool
to
our
PDFlib
Text
Extraction
Toolkit
(
TET
).
TET
is
software
for
extracting
the
text
contents
of
PDF
documents
.
It
is
available
both
as
a
standalone
program
and
a
programming
library
/
component
which
can
be
integrated
into
existing
software
.
TET
extracts
text
from
all
kinds
of
PDF
documents
and
normalizes
the
text
to
Unicode
.
FontReporter
can
be
used
to
create
Unicode
mapping
tables
for
PDF
documents
which
do
not
contain
enough
information
for
extracting
text
,
or
which
contain
wrong
Unicode
mapping
tables
.
Fully
functional
evaluation
versions
of
TET
are
available
for
download
from
www
.
pdflib
.
com
.
TET
PDF
IFilter
.
TET
PDF
IFilter
extracts
text
and
metadata
from
PDF
documents
and
makes
them
available
to
search
and
retrieval
software
on
Windows
.
This
allows
PDF
documents
to
be
searched
on
the
local
desktop
,
a
corporate
server
,
or
the
Web
.
TET
PDF
IFilter
is
based
on
the
patented
PDFlib
Text
Extraction
Toolkit
(
TET
).
TET
PDF
IFilter
is
a
robust
implementation
of
Microsoft
’
s
IFilter
indexing
interface
.
It
works
with
all
search
and
retrieval
products
which
support
the
IFilter
interface
,
e
.
g
.
SharePoint
and
SQL
Server
.
Fully
functional
evaluation
versions
of
TET
PDF
IFilter
are
available
for
download
from
www
.
pdflib
.
com
.
Free
TET
Plugin
.
The
TET
Plugin
is
a
free
companion
to
the
FontReporter
Plugin
.
It
can
be
installed
in
Adobe
Acrobat
and
allows
interactive
use
of
the
Text
Extraction
Toolkit
(
TET
)
with
any
PDF
document
that
is
currently
open
in
Acrobat
.
Using
the
TET
plugin
you
can
access
TET
’
s
functionality
and
experiment
with
TET
options
.
The
TET
plugin
can
freely
be
downloaded
from
www
.
pdflib
.
com
.
2
.
2
Overview
of
PDF
Font
Formats
PDF
supports
a
wide
array
of
font
formats
,
the
details
of
which
can
get
confusing
.
In
order
to
help
you
interpret
the
reports
created
by
FontReporter
we
provide
a
quick
summary
of
PDF
font
formats
and
their
most
important
properties
.
While
the
format
of
a
font
in
a
PDF
document
depends
on
the
format
of
the
original
font
used
to
compose
the
document
,
this
is
not
the
only
factor
which
plays
a
role
here
.
Other
factors
include
the
configuration
options
in
the
PDF-creating
software
,
the
settings
of
the
printer
driver
used
to
generate
PostScript
data
for
PDF
conversion
,
the
set
of
characters
in
the
document
,
the
overall
number
of
used
characters
,
and
more
.
A
particularly
important
aspect
is
the
distinction
between
simple
fonts
and
composite
(
CID
)
fonts
.
Simple
fonts
.
Simple
fonts
comprise
the
PostScript
Type
1
(
including
Multiple
Master
),
TrueType
,
and
Type
3
types
,
and
are
addressed
with
8-bit
codes
.
They
are
therefore
limited
to
a
maximum
of
256
characters
.
Simple
fonts
use
a
name-based
encoding
,
which
is
a
table
for
mapping
the
character
codes
to
the
glyphs
in
the
font
.
Composite
(
CID
)
fonts
.
Composite
or
CID
(
character
ID
)
fonts
come
in
PostScript
and
TrueType
flavors
.
They
can
contain
up
to
65535
characters
and
are
much
more
flexible
than
simple
fonts
.
While
CID
fonts
often
use
2-byte
codes
for
addressing
the
glyphs
in
the
font
,
more
complicated
schemes
with
a
variable
number
of
bytes
per
character
(
1-4
)
are
used
for
CJK
fonts
.
Instead
of
an
encoding
table
CID
fonts
require
a
CMap
(
Character
Map
)
for
providing
the
mapping
from
character
codes
to
actual
glyphs
in
the
font
.
Dozens
of
predefined
CMaps
are
available
for
common
CJK
fonts
.
The
font
’
s
character
collection
specifies
a
particular
set
of
Chinese
,
Japanese
,
or
Korean
characters
.
So-called
Identity
CMaps
are
used
(
mainly
for
Western
fonts
)
in
order
to
directly
address
the
glyphs
in
a
font
without
any
intermediate
mapping
table
.
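The direct addressing used by Identity CMaps can be sketched as follows. The sketch assumes the predefined Identity-H and Identity-V CMaps of the PDF specification, which read every character code as two bytes and use the resulting number as the CID; the function name is hypothetical.

```python
def cids_from_identity_codes(data):
    """Read a byte string as 2-byte big-endian codes; each code is the CID itself."""
    if len(data) % 2:
        raise ValueError("Identity CMaps use exactly two bytes per character")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# Two characters: code 0x0041 and code 0x0100
print(cids_from_identity_codes(b"\x00\x41\x01\x00"))
```

CMaps for CJK fonts with a variable number of bytes per character need a code space lookup instead and are not covered by this sketch.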
Comparison
of
PDF
font
formats
.
Table
2
.
1
details
the
font
formats
supported
in
PDF
,
and
explains
which
original
font
formats
can
be
converted
to
these
types
by
the
PDF
creation
software
.
Table 2.1 Font formats in PDF

Name: Type1
Notes: Classic PostScript Type 1 fonts. In addition to the original Type 1 format they can also be embedded as CFF (Compact Font Format) under the name Type1C (Type 1 Compact). These fonts are the result of classic PostScript Type 1 fonts or OpenType fonts with PostScript outlines.

Name: MMType1 (in Acrobat: MM)
Notes: Multiple Master fonts are an extension of the Type 1 format, and are rarely used. These fonts are the result of PostScript Type 1 Multiple Master fonts.

Name: Type3
Notes: User-defined fonts, i.e. the glyphs are described by raw vector or image operations instead of a ready-made font. Type 3 fonts are always embedded. They are mainly intended for bitmapped fonts and logo fonts. These fonts are often the result of a printer driver converting a PostScript Type 1 or TrueType font to a bitmap font. Some applications use Type 3 fonts for achieving special effects, such as filling an area with a pattern.

Name: TrueType
Notes: TrueType fonts can be embedded directly in PDF. Since not all parts of the original TrueType font are required in PDF, the embedded font data does not necessarily comprise a valid TrueType font. These fonts are the result of TrueType or Type 42 fonts, or OpenType fonts with TrueType outlines.

Name: CIDFontType0 (in Acrobat: Type 1 (CID))
Notes: CID font with PostScript outlines; similar to Type 1 fonts, the font data can be embedded as CFF under the name CIDFontType0C. These fonts are the result of OpenType fonts with PostScript outlines.

Name: CIDFontType2 (in Acrobat: TrueType (CID))
Notes: CID font with TrueType outlines. These fonts are the result of TrueType fonts or OpenType fonts with TrueType outlines.

Name: OpenType
Notes: Directly embedding OpenType fonts requires PDF 1.6. In contrast to CIDFontType0 it allows embedding the full OpenType font file. This format is still very rare. These fonts are the result of OpenType fonts with PostScript outlines.
2
.
3
Contents
of
a
Font
Report
FontReporter
collects
general
information
,
font-related
information
,
and
glyph
tables
for
all
fonts
in
a
PDF
.
These
categories
are
accessible
in
multiple
ways
:
>
bookmarks
contain
general
and
font-related
information
as
clickable
hypertext
>
overview
pages
contain
general
and
font-related
information
on
a
printable
page
;
the
overview
pages
contain
clickable
links
so
that
you
can
easily
navigate
to
a
particular
font
’
s
glyph
table
>
detailed
glyph
tables
repeat
the
font-related
information
,
and
list
the
glyphs
in
a
font
By
clicking
one
of
the
bookmarks
or
using
the
links
in
the
overview
section
you
can
quickly
navigate
to
the
corresponding
glyph
tables
for
a
font
.
FontReporter
will
copy
the
fonts
from
the
original
document
to
the
font
report
without
any
modification
.
All
font
properties
(
e
.
g
.
embedding
and
encoding
)
will
remain
unchanged
.
Note
FontReporter
will
only
process
data
which
is
actually
represented
with
font
and
encoding
structures
in
the
PDF
.
Other
means
of
representing
text
are
ignored
,
such
as
images
containing
text
,
or
characters
which
are
drawn
with
vector
graphics
(
also
called
outline
text
).
2
.
3
.
1
General
Information
The
general
information
in
a
font
report
consists
of
the
following
:
>
file
name
of
the
original
PDF
document
>
number
of
pages
and
fonts
in
the
document
>
PDF
Producer
,
i
.
e
.
the
name
of
the
software
used
to
produce
the
PDF
(
not
the
software
used
to
compose
the
document
).
2
.
3
.
2
Font
and
Encoding
Information
For
each
font
in
the
document
the
following
pieces
of
information
will
be
provided
.
Font
name
.
If
the
font
name
starts
with
six
random
characters
and
a
plus
sign
,
the
font
is
a
subset
which
does
not
contain
all
characters
which
were
originally
present
in
the
font
.
The
subset
prefix
is
useful
when
multiple
subsets
of
the
same
font
are
embedded
in
a
document
.
CJK
font
names
will
be
displayed
in
their
native
spelling
if
possible
.
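The subset prefix can be recognized mechanically. The following sketch assumes the tag form defined in the PDF specification (exactly six uppercase letters followed by a plus sign); the function name and the sample font names are made up for illustration.

```python
import re

# Six uppercase letters followed by "+": the subset tag form of the PDF specification.
SUBSET_PREFIX = re.compile(r"^[A-Z]{6}\+")


def split_subset_prefix(font_name):
    """Return (tag, base_name); tag is None when the name has no subset prefix."""
    if SUBSET_PREFIX.match(font_name) is None:
        return None, font_name
    # The tag is 6 characters long, the plus sign sits at index 6.
    return font_name[:6], font_name[7:]


print(split_subset_prefix("ABCDEF+Helvetica"))
print(split_subset_prefix("Helvetica"))
```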
Font
type
.
The
PDF
font
type
;
see
Section
2
.
2
,
»
Overview
of
PDF
Font
Formats
«,
for
a
list
of
font
formats
.
Embedding
and
subsetting
status
.
The
embedding
status
of
the
font
,
including
subset
information
and
the
format
of
the
embedded
font
data
.
Encoding
.
The
encoding
defines
the
mapping
of
character
codes
to
glyphs
for
simple
fonts
.
This
may
be
the
name
of
one
of
PDF
’
s
predefined
encodings
WinAnsiEncoding
or
MacRomanEncoding
(
these
are
called
Ansi
and
Roman
in
Acrobat
)
or
custom
for
a
nonstandard
encoding
.
If
no
encoding
information
is
given
explicitly
,
but
the
encoding
which
is
built
into
the
font
must
be
used
,
the
word
builtin
is
displayed
instead
of
an
encoding
name
.
CMap
.
For
CID
fonts
the
name
of
a
predefined
CMap
(
character
map
)
will
be
listed
as
encoding
,
plus
the
name
of
the
corresponding
character
collection
(
Chinese
,
Japanese
,
Korean
,
or
Identity
).
Additional
font
information
.
If
a
simple
font
contains
a
CharSet
entry
(
a
list
with
the
names
of
all
glyphs
contained
in
a
subset
)
this
will
be
noted
.
Similarly
,
if
a
CID
font
contains
a
CIDSet
entry
(
a
list
with
all
CIDs
contained
in
a
subset
)
this
will
also
be
noted
.
Symbol
bit
.
The
symbol
bit
signals
that
a
font
contains
characters
outside
the
Adobe
standard
Latin
character
set
.
This
may
be
relevant
for
font
substitution
and
text
extraction
operations
.
For
example
,
a
non-embedded
font
with
the
symbol
bit
cannot
be
substituted
.
Unicode
mapping
table
.
If
the
font
contains
an
explicit
ToUnicode
mapping
table
this
will
be
mentioned
.
ToUnicode
tables
are
crucial
for
text
extraction
and
search
operations
.
2
.
3
.
3
Glyph
Tables
FontReporter
will
create
detailed
glyph
tables
for
each
font
in
the
document
.
The
table
organization
depends
on
the
font
type
.
Regardless
of
the
font
type
,
the
rows
and
columns
of
each
table
will
be
numbered
from
0
to
F
(
these
are
the
hex
codes
for
the
numbers
0-15
);
empty
slots
in
the
tables
will
contain
a
small
dot
as
a
substitute
.
Table
organization
for
simple
fonts
.
Since
simple
fonts
can
address
at
most
256
glyphs
,
a
single
page
with
16x16
slots
is
sufficient
.
All
glyphs
present
in
the
font
/
encoding
combination
will
be
shown
.
Unencoded
glyphs
in
the
font
and
unused
encoding
entries
will
not
be
shown
.
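Locating a character code in the 16x16 table is a matter of splitting the code into its two hex digits. The sketch below assumes that the first hex digit selects the row and the second one the column; the manual does not state the orientation, so swap the two if a report is laid out the other way round. The function name is hypothetical.

```python
def table_slot(code):
    """Return the (row, column) labels, each 0-F, for an 8-bit character code."""
    if not 0 <= code <= 0xFF:
        raise ValueError("code must be in the range 0-255")
    # Assumption: high hex digit = row label, low hex digit = column label.
    row, column = divmod(code, 16)
    return format(row, "X"), format(column, "X")


print(table_slot(0x41))
```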
Table
organization
for
CID
fonts
.
Since
CID
fonts
can
contain
thousands
of
glyphs
,
efficient
table
organization
is
important
for
achieving
compact
font
reports
while
at
the
same
time
providing
quick
access
to
particular
code
ranges
.
If
the
font
contains
a
CIDSet
table
,
only
the
CIDs
contained
in
this
table
will
be
shown
;
otherwise
the
CIDs
which
are
actually
used
in
the
document
will
be
shown
.
All
text
on
all
pages
of
the
document
will
be
processed
to
determine
the
set
of
used
CIDs
,
while
text
in
hypertext
elements
(
such
as
form
fields
)
will
be
ignored
.
A
separate
page
will
be
created
for
each
block
of
256
CIDs
.
The
starting
number
of
the
block
(
in
hex
)
and
the
number
of
glyphs
per
block
are
provided
in
the
page
heading
and
the
corresponding
bookmark
.
For
example
,
the
heading
CID
x0000
means
that
this
block
contains
CIDs
0000-00FF
(
hex
),
or
decimal
0-255
;
the
heading
CID
x0100
means
that
this
block
contains
CIDs
0100-01FF
(
hex
),
or
decimal
256-511
.
Empty
blocks
will
be
omitted
to
reduce
the
overall
size
of
the
font
report
.
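The block arithmetic behind these headings can be sketched in a few lines. The heading format follows the examples above; the function names are illustrative assumptions and not part of FontReporter.

```python
def cid_block_range(cid):
    """Return (first, last) CID of the 256-CID block that contains cid."""
    if cid < 0:
        raise ValueError("CIDs are non-negative")
    start = cid - cid % 256
    return start, start + 255


def cid_block_heading(cid):
    """Return the page heading of the block containing cid, e.g. 'CID x0100'."""
    start, _ = cid_block_range(cid)
    return "CID x%04X" % start


print(cid_block_heading(300), cid_block_range(300))
```

For CID 300 this yields the block starting at hex 0100, i.e. decimal 256-511, in line with the second example above.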
Information
for
each
glyph
.
For
each
glyph
in
a
table
the
following
information
will
be
shown
:
>
A
gray
rectangle
showing
the
glyph
width
(
the
height
of
the
rectangle
is
constant
,
and
not
related
to
the
glyph
geometry
)
>
The
actual
glyph
will
be
displayed
.
Sometimes
the
font
does
not
contain
the
corresponding
glyph
description
.
In
this
case
the
font
’
s
.
notdef
glyph
will
be
displayed
;
depending
on
the
type
of
font
it
may
be
represented
as
a
space
character
,
hollow
box
,
crossed
box
,
or
similar
.
For
CID
fonts
with
vertical
writing
mode
the
glyphs
will
be
positioned
such
that
they
match
the
table
grid
.
>
For
simple
(
8-bit
)
fonts
the
name
of
the
glyph
will
be
shown
.
>
If
the
font
contains
a
Unicode
mapping
table
(
a
ToUnicode
CMap
),
the
glyph
’
s
Unicode
value
(
s
)
will
be
shown
in
U
+
xxxx
notation
if
available
.
Multiple
Unicode
values
may
be
present
for
glyphs
which
map
to
a
sequence
of
multiple
Unicode
characters
,
such
as
ligatures
and
fractions
.
If
a
ToUnicode
table
is
present
for
the
font
,
but
it
does
not
contain
an
entry
for
this
glyph
(
this
may
happen
for
non-textual
symbols
which
are
contained
in
a
text
font
),
FontReporter
will
display
the
string
(
missing
)
instead
.
If
the
Unicode
values
contain
a
surrogate
pair
(
two
UTF-16
values
)
the
corresponding
UTF-32
value
will
be
displayed
instead
.
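The conversion of a surrogate pair to a single UTF-32 value follows the standard UTF-16 rule and can be sketched as below. The function names are hypothetical; how FontReporter performs the conversion internally is not documented here.

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into one code point (its UTF-32 value)."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def u_notation(code_point):
    """Format a code point in U+xxxx notation with at least four hex digits."""
    return "U+%04X" % code_point


print(u_notation(combine_surrogates(0xD835, 0xDC00)))
# A ligature mapped to several characters, here "f" and "i":
print(" ".join(u_notation(ord(ch)) for ch in "fi"))
```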
2
.
4
Investigate
PDF
Problems
with
FontReporter
This
section
lists
some
common
problem
scenarios
along
with
hints
for
interpreting
the
font
report
in
order
to
identify
problems
.
Text
Extraction
does
not
work
.
FontReporter
is
very
useful
if
you
try
to
extract
text
from
a
PDF
(
using
PDFlib
TET
,
Adobe
Acrobat
,
or
any
other
tool
)
and
the
extracted
text
is
incomplete
or
wrong
.
Some
hints
:
>
For
simple
fonts
check
the
encoding
,
glyph
name
,
and
Unicode
mapping
.
In
many
cases
errors
in
the
PDF
,
font
,
or
encoding
are
easy
to
identify
.
In
PDFlib
TET
you
can
correct
many
kinds
of
errors
by
supplying
appropriate
processing
options
or
custom
Unicode
mapping
tables
.
>
For
some
font
/
encoding
combinations
text
extraction
will
only
work
if
a
proper
ToUnicode
mapping
table
is
present
.
The
font
report
will
tell
you
whether
or
not
this
data
structure
is
present
in
the
PDF
.
Complicated
Unicode
mappings
.
In
some
situations
the
Unicode
mapping
of
a
glyph
may
not
be
obvious
.
For
example
,
a
font
’
s
Unicode
mapping
can
decompose
ligatures
into
multiple
constituent
characters
in
order
to
facilitate
text
extraction
.
On
the
other
hand
,
ligatures
sometimes
have
wrong
Unicode
mappings
which
thwart
text
extraction
.
Error
message
in
Acrobat
.
For
some
problematic
PDFs
Acrobat
will
complain
»Cannot find or create font XXX. Some characters may not display or print correctly.«
and
display
bullets
instead
of
the
font
’
s
characters
.
This
usually
happens
when
the
font
is
neither
embedded
nor
installed
on
the
system
,
and
Acrobat
cannot
substitute
the
font
because
the
symbol
bit
is
set
.
Type
3
fonts
.
Type
3
fonts
may
cause
the
PDF
to
print
slowly
or
prevent
text
extraction
and
editing
.
The
font
report
will
provide
a
useful
overview
of
Type
3
fonts
and
the
glyphs
they
contain
.
Glyph
complement
of
font
subsets
.
Font
subsets
do
not
contain
all
glyphs
which
were
initially
available
in
a
font
.
The
font
report
displays
the
glyph
complement
(
set
of
available
glyphs
)
in
a
font
subset
.
Duplicate
fonts
.
In
some
situations
(
e
.
g
.
combining
pages
from
several
PDFs
)
multiple
subsets
of
the
same
font
may
be
present
in
a
PDF
,
or
even
multiple
instances
of
the
same
font
.
This
can
cause
file
size
bloat
or
even
printing
problems
.
The
font
report
will
clearly
identify
this
problem
.
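If the font names from a report are available as a plain list, repeated fonts can be spotted by grouping the names after stripping the subset prefix. This is only a sketch: the sample names are invented, the prefix pattern assumes the six-uppercase-letter tag of the PDF specification, and the function is not part of FontReporter.

```python
import re
from collections import defaultdict

# Subset tag: six uppercase letters followed by "+".
SUBSET_PREFIX = re.compile(r"^[A-Z]{6}\+")


def find_repeated_fonts(font_names):
    """Group names by base font name and keep only groups with several entries."""
    groups = defaultdict(list)
    for name in font_names:
        groups[SUBSET_PREFIX.sub("", name)].append(name)
    return {base: names for base, names in groups.items() if len(names) > 1}


names = ["ABCDEF+Helvetica", "GHIJKL+Helvetica", "Times-Roman"]
print(find_repeated_fonts(names))
```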
Distinguish
text
and
graphics
.
Sometimes
it
is
difficult
to
see
whether
content
in
a
PDF
is
actually
represented
as
native
text
,
vector
graphics
,
or
an
image
.
The
font
report
helps
in
identifying
various
uncommon
scenarios
:
>
Text
which
is
represented
as
vector
graphics
(
outline
text
)
or
an
image
will
be
missing
from
the
font
report
.
>
If
a
logo
or
other
symbol
is
represented
by
a
special
font
you
can
easily
identify
the
logo
font
in
the
font
report
.
2
.
5
Error
Messages
Unaccessible
content
data
.
FontReporter
will
include
this
message
in
the
font
report
if
it
was
unable
to
enumerate
the
page
contents
for
determining
which
glyphs
are
used
in
the
document
.
Most
commonly
this
will
happen
with
encrypted
documents
,
but
also
if
the
page
description
is
damaged
.
Invalid
page
content
state
,
glyphs
cannot
be
drawn
.
This
message
may
appear
for
unembedded
fonts
which
cannot
be
processed
.
Acrobat
usually
complains
with
the
message
»A font required for font substitution is missing«
,
or
substitutes
a
system-installed
font
.
A
Revision
History
Revision history of this manual

Date                Changes
July 3, 2008        Minor changes for FontReporter 1.3 (new: support for Acrobat 9 on Windows)
March 26, 2007      Minor changes for FontReporter 1.2 (new: support for Acrobat 8 on Mac OS X)
January 30, 2006    Minor additions for FontReporter 1.1
February 14, 2005   Initial version for FontReporter 1.0.0
Known
problems
in
this
version
.
We
are
currently
aware
of
the
following
minor
problems
:
>
In
rare
cases
the
glyphs
of
Type
3
fonts
may
appear
too
small
or
too
large
,
or
not
on
the
baseline
.
>
Although
PDF
Producer
entries
with
non-Latin
characters
will
be
displayed
properly
in
the
bookmarks
,
they
will
appear
garbled
in
the
overview
page
.
>
Sometimes
not
all
glyph
names
are
shown
for
simple
fonts
with
the
predefined
encodings
WinAnsi
or
MacRoman
.
>
Glyph
names
for
built-in
encodings
are
not
shown
,
nor
those
for
custom
encodings
which
are
based
on
a
font
’
s
built-in
encoding
.
>
Unicode
values
will
not
be
shown
for
CID
fonts
with
standard
CJK
character
collections
and
simple
fonts
with
the
predefined
encodings
WinAnsi
or
MacRoman
.
However
,
since
these
are
fixed
mappings
no
document-specific
information
is
lost
.