Concepts

We assume a cluster of nodes here, each running one or more containers. At any point in time, you want insight into what's going on on the nodes and in your containers and services.

Monitoring concepts

We suggest you start by reading James Turnbull's wonderful book The Art of Monitoring or, if you've only got 10 minutes, Netsil's Making Sense of the Application Monitoring Landscape.

Tooling

On each node you can, for example, use collectd and cAdvisor to collect data locally. For the other functions there are multiple options.
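
To make the node-local collection step a bit more concrete, here is a minimal sketch in Python of reading container metrics from a cAdvisor instance. It assumes cAdvisor is reachable at http://localhost:8080 and serves Prometheus-style text on /metrics, which is its usual default but may differ in your setup. The function names (fetch_metrics, parse_metrics, cpu_seconds_by_container) are made up for this illustration, and the parser only handles the simple sample lines of the text format, not every corner case.

```python
import re
from collections import defaultdict
from urllib.request import urlopen

# One sample line: metric name, optional {labels}, value, optional timestamp.
_SAMPLE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+\S+)?$'
)
# One label pair inside the braces: key="value" (value may contain escapes).
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def fetch_metrics(url="http://localhost:8080/metrics", timeout=5):
    """Download the raw metrics text (URL is an assumption, adjust as needed)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text):
    """Turn metrics text into a list of (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore anything this simple parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def cpu_seconds_by_container(samples):
    """Sum container_cpu_usage_seconds_total per container, keyed by 'id' label."""
    totals = defaultdict(float)
    for name, labels, value in samples:
        if name == "container_cpu_usage_seconds_total" and "id" in labels:
            totals[labels["id"]] += value
    return dict(totals)
```

Against a running cAdvisor, calling cpu_seconds_by_container(parse_metrics(fetch_metrics())) would give you cumulative CPU seconds per container. In practice you would normally let a collector or time-series database scrape this endpoint rather than poll it by hand. The sketch is only meant to show what the locally collected data looks like.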

A couple of integrated, end-to-end solutions and fully managed offerings are also available:

For more hints on how to select tools and apply good practices, check out the following resources:

There's an excellent conference on this topic, Monitorama, which you should totally keep an eye on.