Created on 2020-10-23 07:56
Published on 2020-10-23 07:58
All applications that you write should have good logging. But what is good logging? Let’s start with a few no-brainers. Your application's uptime is always more important than your logging. If the data that you want to write is more important than that, it should not be in a log but in a database of some sort. Logging is only transient data.
We can define three types of logging: audit logging, application logging and access logging.
Why Audit Logging?
Inevitably, someone asks why event data should be logged on a given system. Essentially there are four categories of reasons:
What to Log?
Essentially, for each system monitored and each likely event condition, enough data must be logged for determinations to be made. At a minimum, you need to be able to answer the standard who, what and when questions.
Both of these questions can be asked about an individual or, in more general terms, about how many events failed.
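As a minimal sketch of the who, what and when idea, the Python snippet below builds audit entries and counts failures either for one individual or across everyone. The field names, the `outcome` values and the helpers `audit_record` and `count_failures` are assumptions made for illustration, not a prescribed audit format.

```python
import json
from datetime import datetime, timezone


def audit_record(who, what, outcome, when=None):
    """Build one audit entry that answers who, what and when."""
    when = when or datetime.now(timezone.utc)
    return {
        "who": who,          # the acting user or service account
        "what": what,        # the action that was attempted
        "outcome": outcome,  # "success" or "failure" (hypothetical values)
        "when": when.isoformat(),
    }


def count_failures(records, who=None):
    """Count failed events for one individual, or for everyone if who is None."""
    return sum(
        1
        for r in records
        if r["outcome"] == "failure" and (who is None or r["who"] == who)
    )


records = [
    audit_record("peter", "login", "failure"),
    audit_record("peter", "login", "success"),
    audit_record("anna", "delete_report", "failure"),
]

print(json.dumps(records[0]))
print(count_failures(records, who="peter"))  # failures for one individual
print(count_failures(records))               # failures in general
```

Writing each entry as one JSON object per line keeps the records easy to search later, but any structured format that preserves these fields would serve.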
Retention period of audit logging
What is a normal time to keep audit logs? I think 90 days should be enough for problems to be noticed and researched, but every industry has its own needs, so it could be a lot longer.
Which events to log
The level and content of security monitoring, alerting and reporting needs to be set during the requirements and design stage of projects, and should be proportionate to the information security risks. This can then be used to define what should be logged. There is no one-size-fits-all solution, and a blind checklist approach can lead to unnecessary "alarm fog", which means real problems go undetected. Where possible, always log:
Optionally consider if the following events can be logged and whether it is desirable information:
Log levels
INFO
WARN
All information about things that are going wrong but do not need intervention from humans. Something is not as it should be, but everything still works. For example, fallback content is sent back.
ERROR
All information about things that are going wrong that do need human intervention. One or several sessions (users) are impacted. For example, a service call timed out or a page cannot be found.
DEBUG
All actions done by humans when using the application. Logging that should help an administrator to determine the cause of an error. All logging should be understandable and relevant for administrators.
TRACE
All actions done by the application. All other logging, which should be understandable and relevant for developers. Examples are method entries and exits, results returned from services and databases. This is the only level at which stack traces are allowed. A stack trace at any other level is a program error.
FATAL
Everyone is affected; the entire application is not working. For example, application properties are not present.
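The levels above can be sketched with Python's standard `logging` module. This is only an illustration under a few assumptions: Python has no built-in TRACE level, so one is registered here with the hypothetical numeric value 5; the built-in CRITICAL and WARNING levels are renamed to FATAL and WARN to match the names used above; and the logger name and messages are made up.

```python
import logging

TRACE = 5  # assumption: any value below logging.DEBUG (10) would do
logging.addLevelName(TRACE, "TRACE")
logging.addLevelName(logging.WARNING, "WARN")    # display name only
logging.addLevelName(logging.CRITICAL, "FATAL")  # display name only


class ListHandler(logging.Handler):
    """Collects formatted lines in memory so the result is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


handler = ListHandler()
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

log = logging.getLogger("example.app")  # hypothetical logger name
log.addHandler(handler)
log.propagate = False
log.setLevel(logging.INFO)  # TRACE and DEBUG are dropped at this threshold

log.log(TRACE, "entering load_page()")
log.debug("user opened the order page")
log.info("application started")
log.warning("recommendation service unavailable, fallback content sent")
log.error("service call timed out")
log.critical("application properties are not present")

for line in handler.lines:
    print(line)
```

With the threshold set to INFO, only the INFO, WARN, ERROR and FATAL lines are kept. Lowering it to `TRACE` while investigating a problem would bring back the developer-level detail.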
Retention
What is a normal time to keep application logs? 90 days online and 1 year offline for all technical logs, but every industry has its own needs, so it could be a lot longer.
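A retention rule like this can be written down as a small policy function. The sketch below assumes that both periods are measured from the day the log file was created, so a file stays online for its first 90 days and offline until it is one year old. That reading, the action names and the function `retention_action` are all assumptions.

```python
from datetime import date

ONLINE_DAYS = 90    # keep directly searchable
OFFLINE_DAYS = 365  # assumption: counted from creation, not from day 91


def retention_action(created, today):
    """Decide what to do with a log file based on its age in days."""
    age = (today - created).days
    if age <= ONLINE_DAYS:
        return "keep-online"
    if age <= OFFLINE_DAYS:
        return "move-offline"
    return "delete"


today = date(2020, 10, 23)
print(retention_action(date(2020, 10, 1), today))
print(retention_action(date(2020, 5, 1), today))
print(retention_action(date(2019, 1, 1), today))
```

Keeping the two periods as named constants makes it simple to lengthen them for an industry with stricter requirements.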
An access log is a list of all the requests for individual files/endpoints that have been made to an API. Access logs can offer a great deal of information about the incoming requests to your API. If you need to analyse these logs in large amounts, it may be beneficial to use a log analysis tool that can “crunch the numbers” for you much faster. Example:
127.0.0.1 - peter [9/Feb/2017:10:34:12 -0700] "GET /sample-image.png HTTP/2" 200 1479
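To show what a log analysis tool does with such a line, here is a minimal Python sketch that splits the example above into named fields. The regular expression and the field names are assumptions based on this one sample line; real access log formats vary by server and configuration.

```python
import re
from datetime import datetime

# Pattern written for the sample line above; other formats need other patterns.
LINE_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_access_line(line):
    """Return the fields of one access log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry


sample = (
    '127.0.0.1 - peter [9/Feb/2017:10:34:12 -0700] '
    '"GET /sample-image.png HTTP/2" 200 1479'
)
entry = parse_access_line(sample)
print(entry["user"], entry["method"], entry["path"], entry["status"], entry["size"])
```

Once lines are parsed into fields like these, counting requests per status code or per user becomes a simple aggregation.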
Retention
What is a normal time to keep access logs? 90 days online and 1 year offline for all technical logs, but every industry has its own needs, so it could be a lot longer.
Conclusion
Whatever you do, think about it in the design phase; do not make logging an after-the-fact exercise. Logging is too important not to think about, and not to push for. As a Dev or Ops engineer you need to know what your application is doing, so that when the time comes and it does something you did not expect, you can look in the logs and say "that is why it went wrong", not "hmm, I do not see anything".