Black belt Elasticsearch

Some more advanced Elasticsearch wisdom I gleaned from Jason Wong and Mark Laney from Elastic.
Contents: Environment with Config X-Pack Security (the 1337 way) Roles Built-in Query Web UI (batteries included) Internals Lucene Segments Elasticsearch Indexing Transaction Log and Flushing Doc Values Caching Field Modelling Typing Denormalising Range Types Mapping Parameters Fixing Data Painless Reindexing APIs Picking up Mapping Changes Multi-fields Custom Marker (flag) Field Fixing Fields Advanced Search and Aggregations Patterns Wildcard Query Regexp Query Null Script (painless) Query Script Field Performance Considerations Search Templates Aggregations Percentile Top Hits Scripted (painless) Aggregations Significant Terms Aggregation Pipeline Aggregations Cluster Management Dedicated Nodes Hot Warm Architecture Tags Verify Shard Allocation Forced Awareness Capacity Planning Shard Allocation Litmus Test Primary Shards Scaling with Indices Scaling with Replicas Resources Time Based Data APIs for Managing Indices Document Modelling Nested Objects Nested Aggregations Parent Child Relationships Argh Which Technique is Best?
Read more →

Elasticsearch Basics

Some Elasticsearch wisdom I gleaned from Jason Wong and Mark Laney from Elastic.
Contents: Use cases Logstash vs Beats? Time Series vs Static Data Logstash Installation Starting and Stopping Elasticsearch Killing Communication Discovery module (networking) Security Read-only Enabling X-Pack (Elasticsearch Security) CRUD Ingestion Reading Search Query and Filter Contexts Mapping Inverted Index Multi Fields (keyword fields) Anatomy of an Analyzer Custom Analyzer The reindex API Node Types Cluster state Shards Anatomy of Search (Shards) Troubleshooting Configuration Responses Cluster and Shard Health Diagnosing Issues Improving Search Results Multi-field Search Boosting Fuzziness Exact Terms Sorting Paging Highlighting Aggregations Best Practices Index Aliases Index Templates Scroll Search Cluster Backup
Use cases: Search; Logging; Metrics, which unlike logs are typically not in a text format.
Read more →

Logstash

A quick walkthrough of Logstash, the ETL engine offered by the Elastic Stack. Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite "stash". Logstash gained its initial popularity with log and metric collection, such as log4j logs, Apache web logs and syslog. Its application has since broadened to all kinds of data sources, like large-scale event streams, webhooks, and database and message queue integration.
Read more →