监控运维实践:原则与策略
by Mike Julian
November 2020, Intermediate to advanced, 142 pages, 3h 12m, Chinese, Posts & Telecom Press
Chapter 8
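# Current time (epoch seconds)
current_time=$(date +%s)
# Seconds elapsed since the state file was last touched
# (last_touch and TIME_LIMIT are set earlier in the script, outside this excerpt)
timeleft=$((current_time - last_touch))
if [ "$timeleft" -gt "$TIME_LIMIT" ]; then
    echo "Dead man's switch activated: job failed!"
fi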
Using this script is simple: put the code in its own cron job that runs once a minute, then change the earlier backup job to this:
run-backup.sh && touch deadman.dat
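Now, if the state file's modification time is older than the configured limit, the dead man's switch fires automatically.
I must caution you that this is only a rudimentary implementation with plenty of room for improvement, but the core idea is sound.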
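The good news is that some hosted services can do this for you, so you don't have to design it yourself. Search Google for cron job monitoring and you will find plenty of options.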
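As one possible refinement of the dead man's switch described above, the check can be wrapped in a function that also treats a missing state file as a failure. This is only a sketch under stated assumptions: the function name check_deadman and its two arguments are hypothetical, and reading the modification time with stat -c %Y assumes GNU coreutils (BSD and macOS stat use different flags).

```shell
# check_deadman STATE_FILE LIMIT_SECONDS
# Hypothetical helper: returns 0 if STATE_FILE was touched within
# LIMIT_SECONDS, otherwise prints a message and returns 1.
check_deadman() {
    state_file=$1
    limit=$2

    # A state file that was never created means the job never succeeded.
    if [ ! -e "$state_file" ]; then
        echo "Dead man's switch activated: $state_file is missing"
        return 1
    fi

    # Modification time in epoch seconds (assumes GNU stat).
    last_touch=$(stat -c %Y "$state_file")
    current_time=$(date +%s)
    elapsed=$((current_time - last_touch))

    if [ "$elapsed" -gt "$limit" ]; then
        echo "Dead man's switch activated: job failed!"
        return 1
    fi
    return 0
}
```

Called from the once-a-minute cron job as, for example, check_deadman deadman.dat 90000, the non-zero return code can drive whatever alerting you already have. The limit should be somewhat longer than the backup job's schedule interval so a slow but successful run does not trip the switch.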
8.13 Logging
Logging can be treated as three separate problems: log collection, log storage, and log analysis.
8.13.1 Collection
I like to divide log locations into two groups: logs in syslog and all other logs.
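If the logs are already handled by a syslog daemon, you only need to configure that daemon to forward them to another server. See your syslog daemon's documentation for the specific steps.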
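To make the forwarding step concrete, here is a small sketch that prints a forwarding rule in rsyslog's legacy syntax, where a single @ before the host means UDP and @@ means TCP. The choice of rsyslog is an assumption (the text does not name a daemon), and the function name forward_rule and the host logs.example.com are hypothetical. Other daemons such as syslog-ng use a different configuration format.

```shell
# forward_rule HOST PORT PROTO
# Hypothetical helper: prints an rsyslog legacy-syntax rule that forwards
# all facilities and priorities (*.*) to HOST:PORT over udp or tcp.
forward_rule() {
    case $3 in
        tcp) printf '*.* @@%s:%s\n' "$1" "$2" ;;   # @@ = TCP
        udp) printf '*.* @%s:%s\n' "$1" "$2" ;;    # @  = UDP
        *)   echo "unknown protocol: $3" >&2
             return 1 ;;
    esac
}
```

For example, forward_rule logs.example.com 514 tcp prints a line that could be placed in a drop-in file under the daemon's configuration directory, followed by a restart of the daemon. Check the rsyslog documentation for your version before relying on it.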
syslog forwarding: UDP versus TCP
Whether to forward syslog over UDP or TCP is still debated. On the UDP side, because no acknowledgment is required, you can send a server's "last gasp" before it crashes ...
Publisher Resources
ISBN: 9787115550750