Documents In World History Stearns Pdf Reader

by Mike Driscoll



The Portable Document Format, or PDF, is a file format that can be used to present and exchange documents reliably across operating systems. While the PDF was originally invented by Adobe, it is now an open standard that is maintained by the International Organization for Standardization (ISO). You can work with a preexisting PDF in Python by using the PyPDF2 package.

PyPDF2 is a pure-Python package that you can use for many different types of PDF operations.

By the end of this article, you’ll know how to do the following:

Extract document information from a PDF in Python
Rotate pages
Merge PDFs
Split PDFs
Add watermarks
Encrypt a PDF

Let’s get started!


History of `pyPdf`, `PyPDF2`, and `PyPDF4`

The original pyPdf package was released way back in 2005. The last official release of pyPdf was in 2010. After a lapse of around a year, a company called Phasit sponsored a fork of pyPdf called PyPDF2. The code was written to be backwards compatible with the original and worked quite well for several years, with its last release being in 2016.

There was a brief series of releases of a package called PyPDF3, and then the project was renamed to PyPDF4. All of these projects do pretty much the same thing, but the biggest difference between pyPdf and PyPDF2+ is that the latter versions added Python 3 support. There is also a separate Python 3 fork of the original pyPdf, but it has not been maintained for many years.

While PyPDF2 was recently abandoned, the new PyPDF4 does not have full backwards compatibility with PyPDF2. Most of the examples in this article will work perfectly fine with PyPDF4, but some will not, which is why PyPDF4 is not featured more heavily in this article. Feel free to swap out the imports for PyPDF2 with PyPDF4 and see how it works for you.

`pdfrw`: An Alternative

Patrick Maupin created a package called pdfrw that can do many of the same things that PyPDF2 does. You can use pdfrw for all of the same sorts of tasks that you will learn how to do in this article for PyPDF2, with the notable exception of encryption.

The biggest difference when it comes to pdfrw is that it integrates with the ReportLab package so that you can take a preexisting PDF and build a new one with ReportLab using some or all of the preexisting PDF.

Installation

You can install PyPDF2 with pip, or with conda if you happen to be using Anaconda instead of regular Python.

Here’s how you would install PyPDF2 with pip: run `pip install pypdf2` from your terminal.

The install is quite quick as PyPDF2 does not have any dependencies. You will likely spend as much time downloading the package as you will installing it.

Now let’s move on and learn how to extract some information from a PDF.

How to Extract Document Information From a PDF in Python

You can use PyPDF2 to extract metadata and some text from a PDF. This can be useful when you’re doing certain types of automation on your preexisting PDF files.

Here are the current types of data that can be extracted:

Author
Creator
Producer
Subject
Title
Number of pages

You need to go find a PDF to use for this example. You can use any PDF you have handy on your machine. To make things easy, I went to Leanpub and grabbed a sample of one of my books for this exercise. The sample you want to download is called reportlab-sample.pdf.

Let’s write some code using that PDF and learn how you can get access to these attributes:
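A minimal sketch of such a script, assuming PyPDF2 1.x (the release series in which the camelCase names used in this article exist; later releases renamed them). The function names extract_information() and format_information() are chosen here for illustration:

```python
def format_information(pdf_path, information, number_of_pages):
    """Build a readable summary from a DocumentInformation-like object."""
    return (
        f"Information about {pdf_path}:\n"
        f"Author: {information.author}\n"
        f"Creator: {information.creator}\n"
        f"Producer: {information.producer}\n"
        f"Subject: {information.subject}\n"
        f"Title: {information.title}\n"
        f"Number of pages: {number_of_pages}"
    )


def extract_information(pdf_path):
    """Print the metadata of the PDF at pdf_path and return it."""
    # PyPDF2 1.x name; imported inside the function so that
    # format_information() stays usable without PyPDF2 installed.
    from PyPDF2 import PdfFileReader

    with open(pdf_path, "rb") as f:
        pdf = PdfFileReader(f)
        information = pdf.getDocumentInfo()
        number_of_pages = pdf.getNumPages()

    print(format_information(pdf_path, information, number_of_pages))
    return information
```

Calling extract_information("reportlab-sample.pdf") would print the summary for the sample file and hand back the metadata object for further use.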

Here you import PdfFileReader from the PyPDF2 package. The PdfFileReader is a class with several methods for interacting with PDF files. In this example, you call .getDocumentInfo(), which will return an instance of DocumentInformation. This contains most of the information that you’re interested in. You also call .getNumPages() on the reader object, which returns the number of pages in the document.

Note: That last code block uses Python 3’s new f-strings for string formatting. If you’d like to learn more, you can check out Python 3’s f-Strings: An Improved String Formatting Syntax (Guide).

The information variable has several instance attributes that you can use to get the rest of the metadata you want from the document. You print out that information and also return it for potential future use.

While PyPDF2 has .extractText(), which can be used on its page objects (not shown in this example), it does not work very well. Some PDFs will return text and some will return an empty string. When you want to extract text from a PDF, you should check out the PDFMiner project instead. PDFMiner is much more robust and was specifically designed for extracting text from PDFs.

Now you’re ready to learn about rotating PDF pages.

How to Rotate Pages

Occasionally, you will receive PDFs that contain pages that are in landscape mode instead of portrait mode. Or perhaps they are even upside down. This can happen when someone scans a document to PDF or email. You could print the document out and read the paper version or you can use the power of Python to rotate the offending pages.

For this example, you can go and pick out a Real Python article and print it to PDF.

Let’s learn how to rotate a few of the pages of that article with PyPDF2:
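One way this could be written, again assuming PyPDF2 1.x and an input PDF with at least three pages. The helper add_rotated_pages() and the output_path parameter with its default value are illustrative choices, not part of PyPDF2:

```python
def add_rotated_pages(pdf_reader, pdf_writer):
    """Copy the first three pages to pdf_writer, rotating the first two."""
    # Rotate the first page (index 0) 90 degrees clockwise
    page_1 = pdf_reader.getPage(0).rotateClockwise(90)
    pdf_writer.addPage(page_1)
    # Rotate the second page 90 degrees counterclockwise
    page_2 = pdf_reader.getPage(1).rotateCounterClockwise(90)
    pdf_writer.addPage(page_2)
    # Add the third page unchanged
    pdf_writer.addPage(pdf_reader.getPage(2))


def rotate_pages(pdf_path, output_path="rotate_pages.pdf"):
    """Write a three-page PDF with the first two pages rotated."""
    from PyPDF2 import PdfFileReader, PdfFileWriter  # PyPDF2 1.x names

    pdf_writer = PdfFileWriter()
    with open(pdf_path, "rb") as source:
        pdf_reader = PdfFileReader(source)
        add_rotated_pages(pdf_reader, pdf_writer)
        # Keep the source open until the write finishes: the writer
        # pulls page data from the reader at write time.
        with open(output_path, "wb") as fh:
            pdf_writer.write(fh)
```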

For this example, you need to import the PdfFileWriter in addition to PdfFileReader because you will need to write out a new PDF. rotate_pages() takes in the path to the PDF that you want to modify. Within that function, you will need to create a writer object that you can name pdf_writer and a reader object called pdf_reader.

Next, you can use .getPage() to get the desired page. Here you grab page zero, which is the first page. Then you call the page object’s .rotateClockwise() method and pass in 90 degrees. Then for the second page, you call .rotateCounterClockwise() and pass it 90 degrees as well.

Note: The PyPDF2 package only allows you to rotate a page in increments of 90 degrees. You will receive an AssertionError otherwise.

After each call to the rotation methods, you call .addPage(). This will add the rotated version of the page to the writer object. The last page that you add to the writer object is the third page, without any rotation applied to it.

Finally, you write out the new PDF using .write(). It takes a file-like object as its parameter. This new PDF will contain three pages. The first two will be rotated in opposite directions and be in landscape, while the third page is a normal page.

Now let’s learn how you can merge multiple PDFs into one.

How to Merge PDFs

There are many situations where you will want to take two or more PDFs and merge them together into a single PDF. For example, you might have a standard cover page that needs to go on to many types of reports. You can use Python to help you do that sort of thing.

For this example, you can open up a PDF and print a page out as a separate PDF. Then do that again, but with a different page. That will give you a couple of inputs to use for example purposes.

Let’s go ahead and write some code that you can use to merge PDFs together:
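A sketch of how merge_pdfs() might look, assuming PyPDF2 1.x. The helper append_all_pages() is a name introduced here for illustration:

```python
def append_all_pages(pdf_reader, pdf_writer):
    """Add every page of pdf_reader to pdf_writer, in order."""
    for page in range(pdf_reader.getNumPages()):
        pdf_writer.addPage(pdf_reader.getPage(page))


def merge_pdfs(paths, output):
    """Concatenate the PDFs listed in paths into a single file at output."""
    from PyPDF2 import PdfFileReader, PdfFileWriter  # PyPDF2 1.x names

    pdf_writer = PdfFileWriter()
    for path in paths:
        # PdfFileReader also accepts a file path instead of a file object
        append_all_pages(PdfFileReader(path), pdf_writer)

    # Write out the merged PDF once every input has been added
    with open(output, "wb") as out:
        pdf_writer.write(out)
```

A call such as merge_pdfs(["document1.pdf", "document2.pdf"], output="merged.pdf") would then produce one file containing the pages of both inputs; the file names here are placeholders.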

You can use merge_pdfs() when you have a list of PDFs that you want to merge together. You will also need to know where to save the result, so this function takes a list of input paths and an output path.

Then you loop over the inputs and create a PDF reader object for each of them. Next you will iterate over all the pages in the PDF file and use .addPage() to add each of those pages to the writer object.

Once you’re finished iterating over all of the pages of all of the PDFs in your list, you will write out the result at the end.

One item I would like to point out is that you could enhance this script a bit by adding in a range of pages to be added if you didn’t want to merge all the pages of each PDF. If you’d like a challenge, you could also create a command line interface for this function using Python’s argparse module.
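One possible shape for both enhancements, offered as a sketch rather than a finished tool. It assumes PyPDF2 1.x, and every name in it (select_pages(), build_parser(), main(), and the --start and --stop options) is hypothetical:

```python
import argparse


def select_pages(pdf_reader, pdf_writer, start=0, stop=None):
    """Add pages start..stop-1 (zero-based) of pdf_reader to pdf_writer."""
    last = pdf_reader.getNumPages()
    # Clamp the upper bound so short documents do not raise an error
    stop = last if stop is None else min(stop, last)
    for page in range(start, stop):
        pdf_writer.addPage(pdf_reader.getPage(page))


def build_parser():
    """Describe the command line: inputs, output, optional page range."""
    parser = argparse.ArgumentParser(description="Merge PDFs into one file.")
    parser.add_argument("inputs", nargs="+", help="PDF files to merge, in order")
    parser.add_argument("-o", "--output", required=True, help="merged PDF path")
    parser.add_argument("--start", type=int, default=0,
                        help="first page to take from each input (zero-based)")
    parser.add_argument("--stop", type=int, default=None,
                        help="page to stop before in each input")
    return parser


def main(argv=None):
    from PyPDF2 import PdfFileReader, PdfFileWriter  # PyPDF2 1.x names

    args = build_parser().parse_args(argv)
    pdf_writer = PdfFileWriter()
    for path in args.inputs:
        select_pages(PdfFileReader(path), pdf_writer, args.start, args.stop)
    with open(args.output, "wb") as out:
        pdf_writer.write(out)
```

To use it as a script, you would call main() under the usual `if __name__ == "__main__":` guard. The same page range is applied to every input here, which keeps the interface small; per-file ranges would need a richer argument format.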

Let’s find out how to do the opposite of merging!

How to Split PDFs

There are times when you might have a PDF that you need to split up into multiple PDFs. This is especially true of PDFs that contain a lot of scanned-in content, but there are plenty of other good reasons for wanting to split a PDF.

Here’s how you can use PyPDF2 to split your PDF into multiple files:
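A minimal sketch, assuming PyPDF2 1.x. The function names split() and split_output_name(), and the naming scheme of a prefix followed by the zero-based page index, are assumptions made for this example:

```python
def split_output_name(name_of_split, page):
    """Return a unique file name for the given zero-based page index."""
    return f"{name_of_split}{page}.pdf"


def split(path, name_of_split):
    """Write each page of the PDF at path to its own single-page PDF."""
    from PyPDF2 import PdfFileReader, PdfFileWriter  # PyPDF2 1.x names

    pdf = PdfFileReader(path)
    for page in range(pdf.getNumPages()):
        # A fresh writer per page, so each output holds exactly one page
        pdf_writer = PdfFileWriter()
        pdf_writer.addPage(pdf.getPage(page))

        with open(split_output_name(name_of_split, page), "wb") as output_pdf:
            pdf_writer.write(output_pdf)
```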

In this example, you once again create a PDF reader object and loop over its pages. For each page in the PDF, you will create a new PDF writer instance and add a single page to it. Then you will write that page out to a uniquely named file. When the script is finished running, you should have each page of the original PDF split into separate PDFs.

Now let’s take a moment to learn how you can add a watermark to your PDF.

How to Add Watermarks

Watermarks are identifying images or patterns on printed and digital documents. Some watermarks can only be seen in special lighting conditions. The reason watermarking is important is that it allows you to protect your intellectual property, such as your images or PDFs. Another term for watermark is overlay.

You can use Python and PyPDF2 to watermark your documents. You need to have a PDF that only contains your watermark image or text.

Let’s learn how to add a watermark now:
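A sketch of create_watermark(), assuming PyPDF2 1.x. The helper stamp_pages() is a hypothetical name used to keep the page loop separate from the file handling:

```python
def stamp_pages(pdf_reader, pdf_writer, watermark_page):
    """Overlay watermark_page on each page and add the result to pdf_writer."""
    for page_number in range(pdf_reader.getNumPages()):
        page = pdf_reader.getPage(page_number)
        # mergePage() draws the watermark's content on top of this page
        page.mergePage(watermark_page)
        pdf_writer.addPage(page)


def create_watermark(input_pdf, output, watermark):
    """Save a copy of input_pdf at output with the watermark on every page."""
    from PyPDF2 import PdfFileReader, PdfFileWriter  # PyPDF2 1.x names

    # The watermark is expected on the first page of the watermark PDF
    watermark_page = PdfFileReader(watermark).getPage(0)

    pdf_reader = PdfFileReader(input_pdf)
    pdf_writer = PdfFileWriter()
    stamp_pages(pdf_reader, pdf_writer, watermark_page)

    with open(output, "wb") as out:
        pdf_writer.write(out)
```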

create_watermark() accepts three arguments:

input_pdf: the PDF file path to be watermarked
output: the path where you want to save the watermarked version of the PDF
watermark: a PDF that contains your watermark image or text

In the code, you open up the watermark PDF and grab just the first page from the document as that is where your watermark should reside. Then you create a PDF reader object using the input_pdf and a generic pdf_writer object for writing out the watermarked PDF.

The next step is to iterate over the pages in the input_pdf. This is where the magic happens. You will need to call .mergePage() and pass it the watermark_page. When you do that, it will overlay the watermark_page on top of the current page. Then you add that newly merged page to your pdf_writer object.

Finally, you write the newly watermarked PDF out to disk, and you’re done!


The last topic you will learn about is how PyPDF2 handles encryption.

How to Encrypt a PDF

PyPDF2 currently only supports adding a user password and an owner password to a preexisting PDF. In PDF land, an owner password will basically give you administrator privileges over the PDF and allow you to set permissions on the document. On the other hand, the user password just allows you to open the document.

As far as I can tell, PyPDF2 doesn’t actually allow you to set any permissions on the document even though it does allow you to set the owner password.

Regardless, this is how you can add a password, which will also inherently encrypt the PDF:
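A sketch of add_encryption(), assuming PyPDF2 1.x, where the writer's .encrypt() method takes user_pwd, owner_pwd, and use_128bit. The helper copy_and_encrypt() is a name made up for this example:

```python
def copy_and_encrypt(pdf_reader, pdf_writer, password):
    """Copy every page to pdf_writer, then set the password on the writer."""
    for page in range(pdf_reader.getNumPages()):
        pdf_writer.addPage(pdf_reader.getPage(page))

    # With owner_pwd=None, PyPDF2 1.x uses the user password for both roles
    pdf_writer.encrypt(user_pwd=password, owner_pwd=None, use_128bit=True)


def add_encryption(input_pdf, output_pdf, password):
    """Save a password-protected copy of input_pdf at output_pdf."""
    from PyPDF2 import PdfFileReader, PdfFileWriter  # PyPDF2 1.x names

    pdf_writer = PdfFileWriter()
    pdf_reader = PdfFileReader(input_pdf)
    copy_and_encrypt(pdf_reader, pdf_writer, password)

    with open(output_pdf, "wb") as fh:
        pdf_writer.write(fh)
```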

add_encryption() takes in the input and output PDF paths as well as the password that you want to add to the PDF. It then opens a PDF writer and a reader object, as before. Since you will want to encrypt the entire input PDF, you will need to loop over all of its pages and add them to the writer.

The final step is to call .encrypt(), which takes the user password, the owner password, and whether or not 128-bit encryption should be added. The default is for 128-bit encryption to be turned on. If you set it to False, then 40-bit encryption will be applied instead.

Note: PDF encryption uses either RC4 or AES (Advanced Encryption Standard) to encrypt the PDF according to pdflib.com.

Just because you have encrypted your PDF does not mean it is necessarily secure. There are tools to remove passwords from PDFs. If you’d like to learn more, Carnegie Mellon University has an interesting paper on the topic.

Conclusion

The PyPDF2 package is quite useful and is usually pretty fast. You can use PyPDF2 to automate large jobs and leverage its capabilities to help you do your job better!

In this tutorial, you learned how to do the following:

Extract metadata from a PDF
Rotate pages
Merge and split PDFs
Add watermarks
Add encryption

Also keep an eye on the newer PyPDF4 package as it will likely replace PyPDF2 soon. You might also want to check out pdfrw, which can do many of the same things that PyPDF2 can do.
