Medium 9780596001322

16. Simple Databases

Tom Phoenix O'Reilly Media ePub

Databases let data persist beyond the end of our program. The databases we're talking about in this chapter are simple ones; how to use full-featured database implementations (Oracle, Sybase, Informix, MySQL, and others) is a topic that could fill an entire book, and usually does. The databases in this chapter are simple enough that you don't need to know about modules to use them.[1]

Every system that has Perl also has a simple database already available in the form of DBM files. This lets your program store data for quick lookup in a file or in a pair of files. When two files are used, one holds the data and the other holds a table of contents, but you don't need to know that in order to use DBM files. We're intentionally being a little vague about the exact implementation, because that will vary depending upon your machine and configuration; see the AnyDBM_File manpage for more information. Also, among the downloadable files from the O'Reilly website is a utility called which_dbm, which tries to tell you which implementation you're using, how many files there are, and what extensions they use, if any.
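The same idea can be sketched outside Perl. The following is a minimal illustration in Python, not the chapter's own code: it assumes Python's standard `dbm` module, whose `dbm.open` picks whichever DBM implementation is available (much as AnyDBM_File does) and whose `dbm.whichdb` plays a role similar to the which_dbm utility. The database name `bedrock` and the stored key and value are hypothetical.

```python
import dbm
import os
import tempfile


def remember(path, key, value):
    """Store one key/value pair, creating the database if needed."""
    with dbm.open(path, "c") as db:  # "c": read-write, create if missing
        db[key.encode("utf-8")] = value.encode("utf-8")


def recall(path, key):
    """Look a key up again, possibly in a later run of the program."""
    with dbm.open(path, "r") as db:  # "r": read-only
        return db[key.encode("utf-8")].decode("utf-8")


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "bedrock")  # hypothetical database name
    remember(path, "fred", "flintstone")
    print(recall(path, "fred"))
    print(dbm.whichdb(path))            # which implementation was used
    print(sorted(os.listdir(workdir)))  # the file or files actually created
```

The directory listing at the end shows whether the implementation on a given machine uses one file or several, and with what extensions; as the text says, that varies with the system.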

Medium 9780596154509

Tips, Traps, and Tracebacks

Mitchell L Model O'Reilly Media ePub

Entrez Programming Utilities

The Entrez Programming Utilities (E-Utilities) provide uniform access to many of the Entrez databases. They are accessed through HTTP queries, but their parameters and responses are designed for programmatic use, as opposed to HTML forms and web pages marked up with formatting details. One of the query parameters specifies the desired form of output: text, XML, etc. The starting point for their documentation is http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html. That page provides links to the documentation of each of the tools and a short course describing how to use them to construct data pipelines (http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=coursework&part=eutils). Perhaps the most useful query in the context of the kinds of data access shown in this book is Efetch, specifically in its use with Sequence and other Molecular Biology Databases, as described at http://eutils.ncbi.nlm.nih.gov/corehtml/query/static/efetchseq_help.html. A complete list of the software tools available from the NCBI website is at http://www.ncbi.nlm.nih.gov/guide/data-software.
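Because the E-Utilities are driven entirely by HTTP query parameters, a request is just a URL. The sketch below builds an Efetch URL with Python's standard library. The base address, the parameter names (`db`, `id`, `rettype`, `retmode`), and the default values are assumptions drawn from the current E-Utilities service rather than from the text above, and should be checked against the documentation; the helper name `efetch_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base address of the current E-Utilities service; it is not one of
# the documentation URLs quoted in the text.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(db, record_id, rettype="fasta", retmode="text"):
    """Build an Efetch query URL for one record.

    retmode is the parameter that selects the output form (text, XML, ...).
    """
    params = {
        "db": db,
        "id": record_id,
        "rettype": rettype,
        "retmode": retmode,
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"
```

The resulting string can be passed to `urllib.request.urlopen` to retrieve the record; the sketch stops short of that so it can be read and tested without network access.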

Medium 9781449380373

13. Using Ruby Third-Party Libraries

Matt Aimonetti O'Reilly Media ePub

In Chapter 12, you saw how to write or include Objective-C libraries in MacRuby apps using frameworks or dynamic libraries. This is very useful for existing Cocoa code or low-level Objective-C wrappers. However, the number of free open source Ruby libraries is quite impressive. As a matter of fact, there are currently more published Ruby libraries than Perl libraries! This chapter explains how to access these Ruby resources.

Ruby libraries are usually packaged as gems, which are library packages used by the RubyGems standard library. A gem bundles its own library files with a definition of its version number and its dependencies on other libraries, if any. You can look for gems at the RubyGems site. In C Ruby, the default Ruby implementation, use the gem command-line tool to install gems on your system. In MacRuby, the gem command is prefixed to avoid conflicting with the C Ruby command. Just as irb is available as macirb, gem for MacRuby is available as macgem.

Medium 9780596102371

10. Momentum Conservation: What Newton Did

Heather Lang O'Reilly Media ePub

No one likes to be a pushover. So far, you’ve learned to deal with objects that are already moving. But what makes them go in the first place? You know that something will move if you push it - but how will it move? In this chapter, you’ll overcome inertia as you get acquainted with some of Newton’s Laws. You’ll also learn about momentum, why it’s conserved, and how you can use it to solve problems.

The pirate captain is being chased across the seas by a ghost ship and needs to make sure it keeps its distance.

His ship’s fitted with some Sieges-R-Us battle cannons. The captain wants to know the maximum range of his cannons, that is, the maximum horizontal distance he can fire a cannonball, and you’ve been called in as the expert.

But the supply of cannonballs is limited at sea - so he won’t actually be able to fire a cannon until you’ve got it all worked out.

Time to get the hang of what the cannonball’s doing. Your job is to sketch graphs that show how the horizontal and vertical components of the displacement, velocity and acceleration change with time. Think about it one component at a time, and do the easier graphs first!
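One way to check the sketched graphs is to compute each component at a few times. The following is a minimal sketch under assumptions the text does not state: no air resistance, a constant gravitational acceleration of 9.8 m/s², and a launch from the origin. The function name, the launch speed, and the launch angle are hypothetical.

```python
import math

G = 9.8  # assumed gravitational acceleration in m/s^2


def cannonball_state(speed, angle_deg, t, g=G):
    """Components of displacement, velocity and acceleration at time t.

    Assumes launch from the origin and no air resistance, so the only
    acceleration is gravity acting straight down.
    """
    angle = math.radians(angle_deg)
    vx0 = speed * math.cos(angle)  # initial horizontal velocity
    vy0 = speed * math.sin(angle)  # initial vertical velocity
    return {
        "x": vx0 * t,                       # horizontal displacement: straight line
        "y": vy0 * t - 0.5 * g * t ** 2,    # vertical displacement: parabola
        "vx": vx0,                          # horizontal velocity: constant
        "vy": vy0 - g * t,                  # vertical velocity: falls linearly
        "ax": 0.0,                          # horizontal acceleration: zero
        "ay": -g,                           # vertical acceleration: constant
    }


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(t, cannonball_state(20.0, 30.0, t))  # hypothetical speed and angle
```

Reading the columns from simplest to hardest matches the advice to do the easier graphs first: the accelerations are flat lines, the velocities are a flat line and a sloped line, and only the vertical displacement curves.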

Medium 9781491905777

4. Common Developer Tasks for Impala

John Russell O'Reilly Media ePub

Here are the special Impala aspects of some standard operations familiar to database developers.

Because Impala’s feature set is oriented toward high-performance queries, much of the data you work with in Impala will originate from some other source, and Impala takes over near the end of the extract-transform-load (ETL) pipeline.

To get data into an Impala table, you can point Impala at data files in an arbitrary HDFS location; move data files from somewhere in HDFS into an Impala-managed directory; or copy data from one Impala table to another. Impala can query the original raw data files, without requiring any conversion or reorganization. Impala can also assist with converting and reorganizing data when those changes are helpful for query performance.
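The three routes described above correspond to three kinds of SQL statement. The sketch below only assembles the statement text in Python so the shapes can be compared side by side; it does not connect to Impala. The table names, the two-column layout, the comma-delimited text format, and the HDFS path are all hypothetical, and the exact clauses should be confirmed against the Impala SQL reference.

```python
def impala_load_statements(table, hdfs_dir, source_table):
    """Return one illustrative statement for each way of getting data in.

    All identifiers and the column list are placeholders for illustration.
    """
    return {
        # Point Impala at files that stay where they are in HDFS.
        "point at existing files": (
            f"CREATE EXTERNAL TABLE {table} (id BIGINT, name STRING) "
            f"ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
            f"LOCATION '{hdfs_dir}'"
        ),
        # Move files from elsewhere in HDFS into the table's directory.
        "move files into the table": (
            f"LOAD DATA INPATH '{hdfs_dir}' INTO TABLE {table}"
        ),
        # Copy rows from one Impala table to another.
        "copy from another table": (
            f"INSERT INTO {table} SELECT * FROM {source_table}"
        ),
    }


if __name__ == "__main__":
    statements = impala_load_statements(
        "sales", "/user/etl/sales_raw", "sales_staging"  # hypothetical names
    )
    for approach, sql in statements.items():
        print(f"{approach}: {sql};")
```

Only the first form leaves the raw files in place; the second moves them, and the third writes new files, which is where converting to a different layout would typically happen.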

As a developer, you might be setting up all parts of a data pipeline, or you might work with files that already exist. Either way, the last few steps in the pipeline are the most important ones from the Impala perspective. You want the data files to go into a well-understood and predictable location in HDFS, and then Impala can work with them.
