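Since when have you been under the illusion that a terraform import with no plan diff means the environment has been successfully reproduced?
Introduction
Terraform v1.5.0 was released recently.
The highlights of 1.5 are without doubt the import block and tf file generation with terraform plan -generate-config-out. They have drawn a lot of attention because they make importing existing resources far easier.
Incidentally, Terraform is often promoted with claims such as "by codifying your infrastructure, you can reproduce your environment." Terraform also has the ability to import existing resources. Neither statement is wrong on its own, but if you combine them, can you say that "importing existing resources and seeing no plan diff means the original environment can be reproduced"?
Unfortunately, that is not quite the case. Some people probably know this from experience, but I suspect many have not realized it. To check, let's run a small experiment.
Environment
The Terraform version used for verification is v1.5.0, the latest at the time of the experiment.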
# terraform -v
Terraform v1.5.0
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v5.3.0
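Also, so that copy-pasting the verification logs cannot cause any real damage, the logs pasted here come from runs against localstack, a mock of AWS. The problem itself is not specific to localstack, though: it also occurs on real AWS, and because it stems from how Terraform itself works, it can occur with other providers as well. AWS is just an example.
Experiment
Preparation
Here we create two aws_security_group resources named foo and bar as samples.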
resource "aws_security_group" "foo" {
  name = "foo"
}
resource "aws_security_group" "bar" {
  name = "bar"
}
# terraform apply -auto-approve
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with
the following symbols:
+ create
Terraform will perform the following actions:
# aws_security_group.bar will be created
+ resource "aws_security_group" "bar" {
+ arn = (known after apply)
+ description = "Managed by Terraform"
+ egress = (known after apply)
+ id = (known after apply)
+ ingress = (known after apply)
+ name = "bar"
+ name_prefix = (known after apply)
+ owner_id = (known after apply)
+ revoke_rules_on_delete = false
+ tags_all = (known after apply)
+ vpc_id = (known after apply)
}
# aws_security_group.foo will be created
+ resource "aws_security_group" "foo" {
+ arn = (known after apply)
+ description = "Managed by Terraform"
+ egress = (known after apply)
+ id = (known after apply)
+ ingress = (known after apply)
+ name = "foo"
+ name_prefix = (known after apply)
+ owner_id = (known after apply)
+ revoke_rules_on_delete = false
+ tags_all = (known after apply)
+ vpc_id = (known after apply)
}
Plan: 2 to add, 0 to change, 0 to destroy.
aws_security_group.bar: Creating...
aws_security_group.bar: Creation complete after 6s [id=sg-6c44c1c79799ce7b2]
aws_security_group.foo: Creating...
aws_security_group.foo: Creation complete after 0s [id=sg-23ffbb130d22cec58]
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
# terraform state list
aws_security_group.bar
aws_security_group.foo
The resources have been created.
Trying state rm and import
Remove only foo from the state, then try importing it.
# terraform state rm aws_security_group.foo
Removed aws_security_group.foo
Successfully removed 1 resource instance(s).
# terraform state list
aws_security_group.bar
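While we're at it, let's try the import block, a new feature in v1.5. The id needed for the import appears in the apply log from when the resource was created, so substitute your own value as you read along.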
import {
  id = "sg-23ffbb130d22cec58"
  to = aws_security_group.foo
}
resource "aws_security_group" "bar" {
  name = "bar"
}
Replace the resource block for aws_security_group.foo with an import block, then generate the resource block by running plan -generate-config-out.
# terraform plan -generate-config-out=generated.tf
aws_security_group.bar: Refreshing state... [id=sg-6c44c1c79799ce7b2]
aws_security_group.foo: Preparing import... [id=sg-23ffbb130d22cec58]
aws_security_group.foo: Refreshing state... [id=sg-23ffbb130d22cec58]
Terraform will perform the following actions:
# aws_security_group.foo will be imported
# (config will be generated)
resource "aws_security_group" "foo" {
arn = "arn:aws:ec2:ap-northeast-1:000000000000:security-group/sg-23ffbb130d22cec58"
description = "Managed by Terraform"
egress = []
id = "sg-23ffbb130d22cec58"
ingress = []
name = "foo"
owner_id = "000000000000"
tags = {}
tags_all = {}
vpc_id = "vpc-5d80b3ff"
}
Plan: 1 to import, 0 to add, 0 to change, 0 to destroy.
╷
│ Warning: Config generation is experimental
│
│ Generating configuration during import is currently experimental, and the generated configuration format may change
│ in future versions.
╵
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Terraform has generated configuration and written it to generated.tf. Please review the configuration and edit it as
necessary before adding it to version control.
Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if
you run "terraform apply" now.
The following tf file was generated.
# __generated__ by Terraform
# Please review these resources and move them into your main configuration files.
# __generated__ by Terraform from "sg-23ffbb130d22cec58"
resource "aws_security_group" "foo" {
  description            = "Managed by Terraform"
  egress                 = []
  ingress                = []
  name                   = "foo"
  name_prefix            = null
  revoke_rules_on_delete = null
  tags                   = {}
  tags_all               = {}
  vpc_id                 = "vpc-5d80b3ff"
}
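Attributes set to zero values such as [], null, and {} feel a bit verbose.
You could look up which attributes are actually required, but if something is missing, validation will complain anyway, so for now let's comment them all out and see what happens.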
# __generated__ by Terraform
# Please review these resources and move them into your main configuration files.
# __generated__ by Terraform from "sg-23ffbb130d22cec58"
resource "aws_security_group" "foo" {
  # description = "Managed by Terraform"
  # egress = []
  # ingress = []
  # name = "foo"
  # name_prefix = null
  # revoke_rules_on_delete = null
  # tags = {}
  # tags_all = {}
  # vpc_id = "vpc-5d80b3ff"
}
With that in place, let's run plan.
# terraform plan
aws_security_group.foo: Preparing import... [id=sg-23ffbb130d22cec58]
aws_security_group.bar: Refreshing state... [id=sg-6c44c1c79799ce7b2]
aws_security_group.foo: Refreshing state... [id=sg-23ffbb130d22cec58]
Terraform will perform the following actions:
# aws_security_group.foo will be imported
resource "aws_security_group" "foo" {
arn = "arn:aws:ec2:ap-northeast-1:000000000000:security-group/sg-23ffbb130d22cec58"
description = "Managed by Terraform"
egress = []
id = "sg-23ffbb130d22cec58"
ingress = []
name = "foo"
owner_id = "000000000000"
tags = {}
tags_all = {}
vpc_id = "vpc-5d80b3ff"
}
Plan: 1 to import, 0 to add, 0 to change, 0 to destroy.
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if
you run "terraform apply" now.
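Oh. The configuration no longer contains the name = "foo" setting that was specified at creation time, yet it looks like the import will go through with no diff.
I have a bad feeling about this, but let's import it anyway.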
# terraform apply -auto-approve
aws_security_group.foo: Preparing import... [id=sg-23ffbb130d22cec58]
aws_security_group.foo: Refreshing state... [id=sg-23ffbb130d22cec58]
aws_security_group.bar: Refreshing state... [id=sg-6c44c1c79799ce7b2]
Terraform will perform the following actions:
# aws_security_group.foo will be imported
resource "aws_security_group" "foo" {
arn = "arn:aws:ec2:ap-northeast-1:000000000000:security-group/sg-23ffbb130d22cec58"
description = "Managed by Terraform"
egress = []
id = "sg-23ffbb130d22cec58"
ingress = []
name = "foo"
owner_id = "000000000000"
tags = {}
tags_all = {}
vpc_id = "vpc-5d80b3ff"
}
Plan: 1 to import, 0 to add, 0 to change, 0 to destroy.
aws_security_group.foo: Importing... [id=sg-23ffbb130d22cec58]
aws_security_group.foo: Import complete [id=sg-23ffbb130d22cec58]
Apply complete! Resources: 1 imported, 0 added, 0 changed, 0 destroyed.
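And so the import ended up succeeding with no plan diff.
Re-creating the resource
What happens if we re-create the foo resource in this state? We can re-create it with terraform apply -replace.
For those unfamiliar with -replace: the effect is the same as commenting out the resource block and applying, then uncommenting it and applying again, except that the destroy and the create happen in a single run.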
# terraform apply -replace=aws_security_group.foo
aws_security_group.foo: Refreshing state... [id=sg-23ffbb130d22cec58]
aws_security_group.bar: Refreshing state... [id=sg-6c44c1c79799ce7b2]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with
the following symbols:
-/+ destroy and then create replacement
Terraform will perform the following actions:
# aws_security_group.foo will be replaced, as requested
-/+ resource "aws_security_group" "foo" {
~ arn = "arn:aws:ec2:ap-northeast-1:000000000000:security-group/sg-23ffbb130d22cec58" -> (known after apply)
~ egress = [] -> (known after apply)
~ id = "sg-23ffbb130d22cec58" -> (known after apply)
~ ingress = [] -> (known after apply)
~ name = "foo" -> (known after apply)
+ name_prefix = (known after apply)
~ owner_id = "000000000000" -> (known after apply)
+ revoke_rules_on_delete = false
- tags = {} -> null
~ tags_all = {} -> (known after apply)
~ vpc_id = "vpc-5d80b3ff" -> (known after apply)
# (1 unchanged attribute hidden)
}
Plan: 1 to add, 0 to change, 1 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
aws_security_group.foo: Destroying... [id=sg-23ffbb130d22cec58]
aws_security_group.foo: Destruction complete after 0s
aws_security_group.foo: Creating...
aws_security_group.foo: Creation complete after 1s [id=sg-ad987d40c176f1729]
Apply complete! Resources: 1 added, 0 changed, 1 destroyed.
# terraform state list
aws_security_group.bar
aws_security_group.foo
The resource has been re-created.
To compare the results, remove foo from the tfstate again and re-import it the same way.
# terraform state rm aws_security_group.foo
Removed aws_security_group.foo
Successfully removed 1 resource instance(s).
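Move the previously generated generated.tf out of the way for now. Re-creating the resource changed its ID, so the id in the import block also needs to be updated.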
# mv generated.tf generated.tf.bk
import {
  id = "sg-ad987d40c176f1729"
  to = aws_security_group.foo
}
resource "aws_security_group" "bar" {
  name = "bar"
}
Generate the resource block for foo again.
# terraform plan -generate-config-out=generated.tf
aws_security_group.foo: Preparing import... [id=sg-ad987d40c176f1729]
aws_security_group.foo: Refreshing state... [id=sg-ad987d40c176f1729]
aws_security_group.bar: Refreshing state... [id=sg-6c44c1c79799ce7b2]
Planning failed. Terraform encountered an error while generating this plan.
╷
│ Warning: Config generation is experimental
│
│ Generating configuration during import is currently experimental, and the generated configuration format may change
│ in future versions.
╵
╷
│ Error: Conflicting configuration arguments
│
│ with aws_security_group.foo,
│ on generated.tf line 4:
│ (source code not available)
│
│ "name": conflicts with name_prefix
╵
╷
│ Error: Conflicting configuration arguments
│
│ with aws_security_group.foo,
│ on generated.tf line 5:
│ (source code not available)
│
│ "name_prefix": conflicts with name
╵
This time we get an error saying that name and name_prefix conflict. Looking at the generated tf file, it contains the following.
# __generated__ by Terraform
# Please review these resources and move them into your main configuration files.
# __generated__ by Terraform
resource "aws_security_group" "foo" {
  description            = "Managed by Terraform"
  egress                 = []
  ingress                = []
  name                   = "terraform-20230613142830207200000001"
  name_prefix            = "terraform-"
  revoke_rules_on_delete = null
  tags                   = {}
  tags_all               = {}
  vpc_id                 = "vpc-5d80b3ff"
}
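Compared with the resource definition created during preparation, we can see that name = "foo" has become name = "terraform-20230613142830207200000001".
Discussion
In this experiment the resources were created with Terraform for simplicity, but the real use case is that someone created something by hand in the console and it is later brought under Terraform management. Even if a name was specified when the resource was first built, importing it and confirming that there is no plan diff does not guarantee that the resource can be fully reproduced: as the experiment shows, the name can change when the resource is re-created.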
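As a minimal sketch of what the experiment suggests, assuming the intent is to keep the original name, the imported resource block would need to state name explicitly instead of being left empty. The id and name are copied from the logs above; which other attributes need the same treatment depends on the provider and is not covered here.

```hcl
# Sketch only: values are taken from the experiment logs, not from a real account.
import {
  id = "sg-23ffbb130d22cec58"
  to = aws_security_group.foo
}

resource "aws_security_group" "foo" {
  # Declared explicitly so that a re-creation reuses this name instead of
  # letting the provider generate one such as "terraform-2023...".
  name = "foo"
}
```

With name declared, a later -replace should plan the new resource with name = "foo" rather than (known after apply), which is the difference the experiment exposed.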
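To avoid misunderstanding: I am not saying that incomplete imports should be avoided. On the contrary, I think manually created resources should be imported as much as possible to raise coverage. We should just be aware of the limitation that confirming there is no plan diff after an import does not, by itself, mean the environment can be fully reproduced.
If reproducibility matters, actually re-creating the resource is the most reliable check. In practice, though, many resources cannot be casually re-created, so it would be nice to infer the outcome from something like the type definitions. But whether an omitted attribute is filled with a default by the provider or by the cloud API, how zero values should be interpreted, how differences in representation are suppressed, and so on all depend on the individual provider implementation and cloud API for each resource type. A general answer to "what happens if an optional attribute is omitted" therefore seems hard to come by.
Even for an Optional attribute, once its value has been written to the tfstate, changing or deleting it in the tf file is usually detected as a plan diff. Looked at another way, omitting an Optional attribute means you have no interest in that setting and do not mind if its value becomes something else. Within the range of attributes you set explicitly, Terraform does behave as declared. In other words, you need to declare the scope you want reproduced. Put that way, it makes sense.
If the priority is to reproduce the current state of the environment as faithfully as possible, filling in values for all optional attributes is the safer choice, and that is what the output of terraform plan -generate-config-out does. For whoever takes over maintenance, however, a list of default values with no particular intent behind them is a heavy cognitive load. This example has only a few attributes, but some resource types have dozens, and the line count grows further when many resources are imported. Readers have to work out for each value whether it is a default or a link to another resource, and picking out relationships buried in a pile of parameters is hard work. Where possible, removing default values at import time will make later readers happier.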
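To illustrate that kind of trimming, here is a rough sketch of what the first generated.tf might look like once values that appear to be defaults are dropped. Which values really are defaults is my assumption from the logs above (for example, "Managed by Terraform" appeared as the description even though none was set), so check the provider documentation before deleting anything in a real project.

```hcl
# Sketch only: a trimmed version of the generated block from the experiment.
resource "aws_security_group" "foo" {
  # Kept on purpose: these are the values a re-creation should reproduce.
  name   = "foo"
  vpc_id = "vpc-5d80b3ff"

  # Dropped as apparent defaults or zero values (assumption, verify per provider):
  #   description            = "Managed by Terraform"
  #   name_prefix            = null
  #   revoke_rules_on_delete = null
  #   tags                   = {}
  #   tags_all               = {}
  #
  # egress = [] and ingress = [] are dropped here as well. As far as I know, the
  # AWS provider treats an explicit empty list as "manage these rules and keep
  # them empty", so keep those two lines if that is the intent.
}
```

Keeping vpc_id is a judgment call: the original configuration omitted it and got the default VPC, so whether to pin it depends on whether the VPC is part of what you want reproduced.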
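Summary
Importing existing resources with Terraform and checking for plan diffs does not necessarily mean the environment can be fully reproduced. Omitting an optional attribute means you are not interested in that setting and do not care if its value changes. Within the declared scope, things behave as declared, so it is important to be explicit about which scope you want reproduced. Treat Terraform configuration not as a mere parameter sheet but as a design document that makes the intent of each setting clear. If you understand what an attribute means and setting it explicitly is meaningful, write it out even when the value is the default. Parts with no special requirement, where the default is acceptable, can be omitted. For configuration inherited from someone else whose intent is unclear, either accept the current state as it is for now or, if possible, look up the defaults in the provider implementation or the API and omit the parts that do not appear to have been changed from the default, which will make later maintainers happier.
This may all sound obvious, but I suspect many people loosely equate "no plan diff" with "the environment can be reproduced" at a conceptual level, so I wrote it down deliberately.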