Accumulo: Application Development, Table Design, and Best Practices

Aaron Cordova, Billie Rinaldi, Michael Wall

Language: English

Pages: 552

ISBN: 1449374182

Format: PDF / Kindle (mobi) / ePub


Get up to speed on Apache Accumulo, the flexible, high-performance key/value store created by the National Security Agency (NSA) and based on Google’s Bigtable data storage system. Written by former NSA team members, this comprehensive tutorial and reference covers Accumulo architecture, application development, table design, and cell-level security.

With clear information on system administration, performance tuning, and best practices, this book is ideal for developers seeking to write Accumulo applications, administrators charged with installing and maintaining Accumulo, and other professionals interested in what Accumulo has to offer. You will find everything you need to use this system fully.

  • Get a high-level introduction to Accumulo’s architecture and data model
  • Take a rapid tour through single- and multiple-node installations, data ingest, and query
  • Learn how to write Accumulo applications for several use cases, based on examples
  • Dive into Accumulo internals, including information not available in the documentation
  • Get detailed information for installing, administering, tuning, and measuring performance
  • Learn best practices based on successful implementations in the field
  • Find answers to common questions that every new Accumulo user asks

Architectural Design - New Health Facilities

Writing About Architecture: Mastering the Language of Buildings and Cities (Architecture Briefs)

Guerilla Furniture Design: How to Build Lean, Modern Furniture with Salvaged Materials

Design School Wisdom: Make First, Stay Awake, and Other Essential Lessons for Work and Life

The Management of Construction: A Project Lifecycle Approach

results 0.14 1) query global index 0.02 1976233182 Query completed.

We’ll try adding another search term, sea:

    TEXT == 'old' and TEXT == 'man' and TEXT == 'sea'

This returns the results in Figure 8-4.

Figure 8-4. Refined search results

This cuts down the matching entries to only 339:

    HTML query: TEXT == 'old' and TEXT == 'man' and TEXT == 'sea'
    Connecting to [instanceName = koverse, zookeepers = koversevm:2181, username = root].
    339 matching entries found in optimized query.

Designing Row

time series, secondary indexes, and complex text search can all benefit from using batch scanners. See Figure 1-23.

Figure 1-23. Scanning a batch of rows

More detail on developing applications using Accumulo’s API is found in the chapters beginning with Chapter 3.

Approach to Rows

Accumulo takes a slightly different approach to rows in the client API than do some other implementations based on Bigtable, such as HBase. Accumulo’s read API is designed to stream key-value pairs to the client

through the iterator. If no issues are observed, an iterator can also be applied at minor and major compaction times, making the changes permanent on disk. In the shell this would be done like this:

    user@accumulo> setiter -class com.company.MyIterator -n myiterator -minc -majc \
        -t myTable -p 40

If issues are observed, an iterator can be disabled in the shell via the deleteiter command:

    user@accumulo> deleteiter -n myiterator -minc -majc -scan -t myTable

In particular, shutting down a system

performance, because a single tablet server process is limited in some ways. Accumulo is designed to scale horizontally, meaning adding more servers rather than increasing the resources of each server.

Storage Devices

Unlike some databases, Accumulo is designed to keep most of the data managed on disk. As much data as will fit is cached into RAM as data is read from disk, but even reads that request data that is not cached in RAM are designed to be fast, because Accumulo minimizes disk seeks by

middle and compares that to the key it’s looking for, and based on this comparison it decides in which direction it must continue searching. This continues until the computer finds an exact match or determines that the key sought is not in the list.

Figure 1-3. An example of binary search

This dramatically reduces the number of keys that must be examined and makes searching for a particular key faster. How much faster? If it takes 10 milliseconds to fetch and examine one key, finding a
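
The binary search described in this excerpt can be sketched in a few lines of Python. This is only an illustration of the general technique, not code from the book or from Accumulo: the function name, the list of 1,000 sample row keys, and the helper that estimates lookup time are all assumptions made for the example. Only the 10-millisecond cost per key comes from the excerpt.

```python
import math


def binary_search(sorted_keys, target):
    """Return (index, keys_examined); index is -1 when target is absent."""
    lo, hi = 0, len(sorted_keys) - 1
    examined = 0
    while lo <= hi:
        mid = (lo + hi) // 2          # look at the key in the middle
        examined += 1
        if sorted_keys[mid] == target:
            return mid, examined      # exact match found
        if sorted_keys[mid] < target:
            lo = mid + 1              # continue in the upper half
        else:
            hi = mid - 1              # continue in the lower half
    return -1, examined               # the key is not in the list


def worst_case_ms(num_keys, ms_per_key=10):
    """Worst-case lookup time, assuming a fixed cost per key examined."""
    return math.ceil(math.log2(num_keys + 1)) * ms_per_key


# Hypothetical sample data: 1,000 zero-padded row keys in sorted order.
keys = [f"row{n:04d}" for n in range(1000)]
index, examined = binary_search(keys, "row0742")
missing_index, _ = binary_search(keys, "zzz")
```

Under these assumptions, a lookup in the 1,000-key list never examines more than 10 keys, whereas a scan from the start of the list could examine all 1,000.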
