pdf: October 2007

Sunday, October 28, 2007

pdf text

This chapter describes the special facilities in PDF for dealing with text, specifically for representing characters with glyphs from fonts. A glyph is a graphical shape and is subject to all graphical manipulations, such as coordinate transformation. Because of the importance of text in most page descriptions, PDF provides higher-level facilities that permit an application to describe, select, and render glyphs conveniently and efficiently.

The first section is a general description of how glyphs from fonts are painted on the page. Subsequent sections cover the following topics in detail:

•Text state. A subset of the graphics state parameters pertain to text, including parameters that select the font, scale the glyphs to an appropriate size, and accomplish other graphical effects.

•Text objects and operators. The text operators specify the glyphs to be painted, represented by string objects whose values are interpreted as sequences of character codes. A text object encloses a sequence of text operators and associated parameters.

•Font data structures. Font dictionaries and associated data structures provide information that a consumer application needs to interpret the text and position the glyphs properly. The definitions of the glyphs themselves are contained in font programs, which may be embedded in the PDF file, built into the application, or obtained from an external font file.
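As a rough illustration of how these pieces fit together, the sketch below (in Python) assembles the operators of a single text object: BT and ET delimit the object, Tf sets the font and size in the text state, Td positions the text, and Tj shows a string of character codes. The helper names and the font resource name F1 are assumptions made for this example; a real page would also need a font dictionary registered under that name in its resources.

```python
def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return (text.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)"))


def text_object(font_name: str, size: float, x: float, y: float, text: str) -> str:
    """Build the operator sequence for one text object (hypothetical helper)."""
    lines = [
        "BT",                                  # begin text object
        f"/{font_name} {size:g} Tf",           # text state: select font and size
        f"{x:g} {y:g} Td",                     # move to the start of the line
        f"({escape_pdf_string(text)}) Tj",     # show the string's glyphs
        "ET",                                  # end text object
    ]
    return "\n".join(lines)


# "F1" is an assumed resource name, not something defined by the excerpt above.
example = text_object("F1", 12, 72, 720, "Hello (PDF)")
print(example)
```

The string passed to Tj is a sequence of character codes, so which glyphs actually appear depends on the font and encoding selected by Tf.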

full pdf text
PDF Reference, Sixth Edition, version 1.7 (PDF, 31.0M)

pdf graphics

The graphics operators used in PDF content streams describe the appearance of pages that are to be reproduced on a raster output device. The facilities described in this chapter are intended for both printer and display applications.

The PDF graphics operators form six main groups:

•Graphics state operators manipulate the data structure called the graphics state, the global framework within which the other graphics operators execute. The graphics state includes the current transformation matrix (CTM), which maps user space coordinates used within a PDF content stream into output device coordinates. It also includes the current color, the current clipping path, and many other parameters that are implicit operands of the painting operators.

•Path construction operators specify paths, which define shapes, line trajectories, and regions of various sorts. They include operators for beginning a new path, adding line segments and curves to it, and closing it.

•Path-painting operators fill a path with a color, paint a stroke along it, or use it as a clipping boundary.

•Other painting operators paint certain self-describing graphics objects. These include sampled images, geometrically defined shadings, and entire content streams that in turn contain sequences of graphics operators.

•Text operators select and show character glyphs from fonts (descriptions of typefaces for representing text characters). Because PDF treats glyphs as general graphical shapes, many of the text operators could be grouped with the graphics state or painting operators. However, the data structures and mechanisms for dealing with glyph and font descriptions are sufficiently specialized that Chapter 5 focuses on them.

•Marked-content operators associate higher-level logical information with objects in the content stream. This information does not affect the rendered appearance of the content (although it may determine if the content should be presented at all; see Section 4.10, "Optional Content"); it is useful to applications that use PDF for document interchange. Marked content is described in Section 10.5, "Marked Content."
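To make the role of the current transformation matrix more concrete, here is a minimal Python sketch of how a six-number PDF matrix [a b c d e f] maps a user space point to another coordinate system, and how two such matrices are concatenated. The function names and the sample scale and translation values are assumptions chosen for illustration; they do not correspond to any particular output device.

```python
from typing import Tuple

# A PDF transformation matrix is written as six numbers [a b c d e f].
Matrix = Tuple[float, float, float, float, float, float]


def transform(m: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Map the point (x, y) through matrix m."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def concat(m1: Matrix, m2: Matrix) -> Matrix:
    """Return the matrix that applies m1 first and then m2."""
    a1, b1, c1, d1, e1, f1 = m1
    a2, b2, c2, d2, e2, f2 = m2
    return (
        a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2, e1 * b2 + f1 * d2 + f2,
    )


# Illustrative values only: scale by 2, then translate by (10, 20).
scale = (2.0, 0.0, 0.0, 2.0, 0.0, 0.0)
translate = (1.0, 0.0, 0.0, 1.0, 10.0, 20.0)
ctm = concat(scale, translate)

print(transform(ctm, 3.0, 4.0))  # (16.0, 28.0)
```

Because every painting operator reads the CTM implicitly, changing it once in the graphics state affects all paths, text, and images painted afterward.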

This chapter presents general information about device-independent graphics in PDF: how a PDF content stream describes the abstract appearance of a page. Rendering—the device-dependent part of graphics—is covered in Chapter 6. The Bibliography lists a number of books that give details of these computer graphics concepts and their implementation.

full pdf graphics
PDF Reference, Sixth Edition, version 1.7 (PDF, 31.0M)

pdf syntax

This chapter covers everything about the syntax of PDF at the object, file, and document level. It sets the stage for subsequent chapters, which describe how the contents of a PDF file are interpreted as page descriptions, interactive navigational aids, and application-level logical structure.

PDF syntax is best understood by thinking of it in four parts, as shown in Figure 3.1:

•Objects. A PDF document is a data structure composed from a small set of basic types of data objects. Section 3.1, "Lexical Conventions," describes the character set used to write objects and other syntactic elements. Section 3.2, "Objects," describes the syntax and essential properties of the objects. Section 3.2.7, "Stream Objects," provides complete details of the most complex data type, the stream object.

•File structure. The PDF file structure determines how objects are stored in a PDF file, how they are accessed, and how they are updated. This structure is independent of the semantics of the objects. Section 3.4, "File Structure," describes the file structure. Section 3.5, "Encryption," describes a file-level mechanism for protecting a document’s contents from unauthorized access.

•Document structure. The PDF document structure specifies how the basic object types are used to represent components of a PDF document: pages, fonts, annotations, and so forth. Section 3.6, "Document Structure," describes the overall document structure; later chapters address the detailed semantics of the components.

•Content streams. A PDF content stream contains a sequence of instructions describing the appearance of a page or other graphical entity. These instructions, while also represented as objects, are conceptually distinct from the objects that represent the document structure and are described separately. Section 3.7, "Content Streams and Resources," discusses PDF content streams and their associated resources.
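The four parts can be seen side by side in a very small file. The Python sketch below writes a one-page PDF by hand: the dictionaries and the stream are objects, the header, cross-reference table, and trailer form the file structure, the catalog, page tree, and page form the document structure, and object 4 is a content stream. It is a sketch under assumptions rather than a reference implementation: the function name, page size, and text are invented for the example, and the output has not been checked against any particular viewer.

```python
def build_minimal_pdf() -> bytes:
    """Assemble a one-page PDF file in memory (illustrative sketch only)."""
    # Content stream: instructions that paint the page.
    content = b"BT /F1 24 Tf 72 720 Td (Hello) Tj ET"

    # Objects 1..5: catalog, page tree, page, content stream, font.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.7\n")          # file header
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))            # byte offset of this object
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: one 20-byte entry per object, plus entry 0.
    xref_offset = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    # Trailer: points at the catalog and at the cross-reference table.
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)


pdf = build_minimal_pdf()
print(len(pdf), "bytes")
```

Writing the result to disk with `open("hello.pdf", "wb").write(pdf)` gives a file to inspect in a text editor; the file name is arbitrary.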

full pdf syntax
PDF Reference, Sixth Edition, version 1.7 (PDF, 31.0M)

pdf overview

PDF is a file format for representing documents in a manner independent of the application software, hardware, and operating system used to create them and of the output device on which they are to be displayed or printed. A PDF document consists of a collection of objects that together describe the appearance of one or more pages, possibly accompanied by additional interactive elements and higher-level application data. A PDF file contains the objects making up a PDF document along with associated structural information, all represented as a single self-contained sequence of bytes.

A document’s pages (and other visual elements) can contain any combination of text, graphics, and images. A page’s appearance is described by a PDF content stream, which contains a sequence of graphics objects to be painted on the page. This appearance is fully specified; all layout and formatting decisions have already been made by the application generating the content stream.

In addition to describing the static appearance of pages, a PDF document can contain interactive elements that are possible only in an electronic representation. PDF supports annotations of many kinds for such things as text notes, hypertext links, markup, file attachments, sounds, and movies. A document can define its own user interface; keyboard and mouse input can trigger actions that are specified by PDF objects. The document can contain interactive form fields to be filled in by the user, and can export the values of these fields to or import them from other applications.

Finally, a PDF document can contain higher-level information that is useful for interchange of content among applications. In addition to specifying appearance, a document’s content can include identification and logical structure information that allows it to be searched, edited, or extracted for reuse elsewhere. PDF is particularly well suited for representing a document as it moves through successive stages of a prepress production workflow.

full pdf overview
PDF Reference, Sixth Edition, version 1.7 (PDF, 31.0M)

Saturday, October 27, 2007

pdf introduction

The Adobe Portable Document Format (PDF) is the native file format of the Adobe® Acrobat® family of products. The goal of these products is to enable users to exchange and view electronic documents easily and reliably, independently of the environment in which they were created. PDF relies on the same imaging model as the PostScript® page description language to describe text and graphics in a device-independent and resolution-independent manner. To improve performance for interactive viewing, PDF defines a more structured format than that used by most PostScript language programs. PDF also includes objects, such as annotations and hypertext links, that are not part of the page itself but are useful for interactive viewing and document interchange.

1.1 About This Book

This book provides a description of the PDF file format and is intended primarily for developers of PDF producer applications that create PDF files directly. It also contains enough information to allow developers to write PDF consumer applications that read existing PDF files and interpret or modify their contents.

Although the PDF Reference is independent of any particular software implementation, some PDF features are best explained by describing the way they are processed by a typical application program. In such cases, this book uses the Acrobat family of PDF viewer applications as its model. (The prototypical viewer is the fully capable Acrobat product, not the limited Adobe Reader® product.) Appendix C discusses some implementation limits in the Acrobat viewer applications, even though these limits are not part of the file format itself. Appendix H provides compatibility and implementation notes that describe how Acrobat viewers behave when they encounter newer features they do not understand and specify areas in which the Acrobat products diverge from the specification presented in this book. Implementors of PDF producer and consumer applications can use this information as guidance.

This edition of the PDF Reference describes version 1.7 of PDF. (See implementation note 1 in Appendix H.) Throughout the book, information specific to particular versions of PDF is marked with indicators such as (PDF 1.3) or (PDF 1.4). Features so marked may be new or substantially redefined in that version. Features designated (PDF 1.0) have generally been superseded in later versions; unless otherwise stated, features identified as specific to other versions are understood to be available in later versions as well. (PDF consumer applications designed for a specific PDF version generally ignore newer features they do not recognize; implementation notes in Appendix H point out exceptions.)

Note: In this edition, the term consumer is generally used to refer to PDF processing applications; viewer is reserved for applications that implement features that interact with users. This distinction is not always clear, however, since non-interactive applications may process objects in PDF documents (such as annotations) that represent interactive features.

The rest of the book is organized as follows:

•Chapter 2, "Overview," briefly introduces the overall architecture of PDF and the design considerations behind it, compares it with the PostScript language, and describes the underlying imaging model that they share.

•Chapter 3, "Syntax," presents the syntax of PDF at the object, file, and document level. It sets the stage for subsequent chapters, which describe how that information is interpreted as page descriptions, interactive navigational aids, and application-level logical structure.

•Chapter 4, "Graphics," describes the graphics operators used to describe the appearance of pages in a PDF document.

•Chapter 5, "Text," discusses PDF’s special facilities for presenting text in the form of character shapes, or glyphs, defined by fonts.

•Chapter 6, "Rendering," considers how device-independent content descriptions are matched to the characteristics of a particular output device.

•Chapter 7, "Transparency," discusses the operation of the transparent imaging model, introduced in PDF 1.4, in which objects can be painted with varying degrees of opacity, allowing the previous contents of the page to show through.

•Chapter 8, "Interactive Features," describes those features of PDF that allow a user to interact with a document on the screen by using the mouse and keyboard.

•Chapter 9, "Multimedia Features," describes those features of PDF that support embedding and playing multimedia content, including video, music and 3D artwork.

•Chapter 10, "Document Interchange," shows how PDF documents can incorporate higher-level information that is useful for the interchange of documents among applications.

•Appendix A, "Operator Summary," lists all the operators used in describing the visual content of a PDF document.

•Appendix B, "Operators in Type 4 Functions," summarizes the PostScript operators that can be used in PostScript calculator functions, which contain code written in a small subset of the PostScript language.

•Appendix C, "Implementation Limits," describes typical size and quantity limits imposed by the Acrobat viewer applications.

•Appendix D, "Character Sets and Encodings," lists the character sets and encodings that are assumed to be predefined in any PDF consumer application.

•Appendix E, "PDF Name Registry," discusses a registry, maintained for developers by Adobe Systems, that contains private names and formats used by PDF producers or Acrobat plug-in extensions.

•Appendix F, "Linearized PDF," describes a special form of PDF file organization designed to work efficiently in network environments.

•Appendix G, "Example PDF Files," presents several examples showing the structure of actual PDF files, ranging from one containing a minimal one-page document to one showing how the structure of a PDF file evolves over the course of several revisions.

•Appendix H, "Compatibility and Implementation Notes," provides details on the behavior of Acrobat viewer applications and describes how consumer applications should handle PDF files containing features that they do not recognize.

•Appendix I, "Computation of Object Digests," describes in detail an algorithm for calculating an object digest (discussed in Section 8.7, "Digital Signatures").

full pdf introduction
PDF Reference, Sixth Edition, version 1.7 (PDF, 31.0M)

pdf origins

The origins of the PDF (Portable Document Format) and the Adobe® Acrobat® product family date to early 1990. At that time, the PostScript® page description language was rapidly becoming the worldwide standard for the production of the printed page. PDF builds on the PostScript page description language by layering a document structure and interactive navigation features on PostScript’s underlying imaging model, providing a convenient, efficient mechanism enabling documents to be reliably viewed and printed anywhere.

The PDF specification was first published at the same time the first Acrobat products were introduced in 1993. Since then, updated versions of the specification have been and continue to be available from Adobe on the World Wide Web. It includes the precise documentation of the underlying imaging model from PostScript along with the PDF-specific features that are combined in version 1.7 of the PDF standard.

Over the past eleven years, aided by the explosive growth of the Internet, PDF has become the de facto standard for the electronic exchange of documents. Well over 500 million copies of the free Adobe Reader® software have been distributed around the world, facilitating efficient sharing of digital content. In addition, PDF is now the industry standard for the intermediate representation of printed material in electronic prepress systems for conventional printing applications. As major corporations, government agencies, and educational institutions streamline their operations by replacing paper-based workflow with electronic exchange of information, the impact and opportunity for the application of PDF will continue to grow at a rapid pace.

PDF is the file format that underlies the Adobe® Intelligent Document Platform, facilitating the process of creating, managing, securing, collecting, and exchanging digital content on diverse platforms and devices. The Intelligent Document Platform fulfills a set of requirements related to business process needs for the global desktop user, including:

•Preservation of document fidelity across the enterprise, independently of the device, platform, and software
•Merging of content from diverse sources—Web sites, word processing and spreadsheet programs, scanned documents, photos, and graphics—into one self-contained document while maintaining the integrity of all original source documents
•Real-time collaborative editing of documents from multiple locations or platforms
•Digital signatures to certify authenticity
•Security and permissions to allow the creator to retain control of the document and associated rights
•Accessibility of content to those with disabilities
•Extraction and reuse of content using other file formats and applications
•Electronic forms to gather data and integrate it with business systems.

The emergence of PDF as a standard for electronic information exchange is the result of concerted effort by many individuals in both the private and public sectors. Without the dedication of Adobe employees, our industry partners, and our customers, the widespread acceptance of PDF could not have been achieved. We thank all of you for your continuing support and creative contributions to the success of PDF.

full pdf origins
PDF Reference, Sixth Edition, version 1.7 (PDF, 31.0M)

Friday, October 26, 2007

pdf adobe creator

Adobe Reader PDF

What is PDF?
Adobe Portable Document Format PDF
Invented by Adobe Systems and perfected over 15 years, Adobe Portable Document Format (PDF) lets you capture and view robust information—from any application, on any computer system—and share it with anyone around the world. Individuals, businesses, and government agencies everywhere trust and rely on Adobe® PDF to communicate their ideas and vision.

Liberating information and the flow of ideas

Open format—De facto standard for more secure, dependable electronic information exchange—recognized by industries and governments around the world. Compliant with industry standards including PDF/A, PDF/X, and PDF/E.

As of January 2007, Adobe is working with an ISO Technical Committee to submit PDF 1.7 to ISO for approval as a formal, open standard, named ISO 32000. ISO 32000 will be maintained and further developed by this technical committee with the objective of protecting the integrity and longevity of PDF. This will provide a formal, open standard for the billion+ PDF files in existence today.

Multiplatform — Viewable and printable on any platform — Macintosh, Microsoft® Windows®, UNIX®, and many mobile platforms.

Extensible — More than 1,800 vendors worldwide offer PDF-based solutions including creation, plug-in, consulting, training, and support tools.

Trusted and reliable — More than 200 million PDF documents on the web today serve as evidence of the number of organizations that rely on Adobe PDF to capture information.

Maintain information integrity — Adobe PDF files look exactly like original documents and preserve source file information — text, drawings, 3D, full-color graphics, photos, and even business logic — regardless of the application used to create them.

Keep information secure — Digitally sign or password-protect Adobe PDF documents created with Adobe Acrobat® 8 or Adobe LiveCycle™ software.

Searchable — Leverage full-text search features to locate words, bookmarks, and data fields in documents.

Accessible — Adobe PDF documents work with assistive technology to help make information accessible to people with disabilities.

Learn more
Adobe PDF Technology Center
Adobe Acrobat software development kit (SDK)
Adobe PDF Library SDK
