Since when have you been under the illusion that terraform import with no plan diff means the environment has been successfully reproduced?

Introduction

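Terraform v1.5.0 was released recently.

The headline features of 1.5 are without doubt the import block and tf file generation via terraform plan -generate-config-out. They promise to make importing existing resources far easier, and they have attracted a lot of attention.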

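On a related note, one commonly advertised feature of Terraform is that "codifying your infrastructure lets you reproduce an environment." Terraform also has "the ability to import existing resources." Neither claim is wrong on its own, but does combining them let us say that "importing existing resources and seeing no plan diff means the original environment can be reproduced"?

Unfortunately, that is not quite how it works. Some people probably know this from experience, but I suspect many have not noticed. To check, let's run a small experiment.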

Environment

The Terraform version used for this experiment is v1.5.0, the latest at the time of writing.

# terraform -v
Terraform v1.5.0
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v5.3.0

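So that copy-pasting the logs cannot cause any real damage, the logs shown here were produced against localstack, an AWS mocking tool. The problem itself also occurs on real AWS, and it stems from how Terraform works, so it is not limited to AWS and can occur with other providers as well. This is just one example.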

Experiment

Preparation

As an example, we create two aws_security_group resources named foo and bar.

resource "aws_security_group" "foo" {
  name = "foo"
}

resource "aws_security_group" "bar" {
  name = "bar"
}

# terraform apply -auto-approve

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with
the following symbols:
  + create

Terraform will perform the following actions:

  # aws_security_group.bar will be created
  + resource "aws_security_group" "bar" {
      + arn                    = (known after apply)
      + description            = "Managed by Terraform"
      + egress                 = (known after apply)
      + id                     = (known after apply)
      + ingress                = (known after apply)
      + name                   = "bar"
      + name_prefix            = (known after apply)
      + owner_id               = (known after apply)
      + revoke_rules_on_delete = false
      + tags_all               = (known after apply)
      + vpc_id                 = (known after apply)
    }

  # aws_security_group.foo will be created
  + resource "aws_security_group" "foo" {
      + arn                    = (known after apply)
      + description            = "Managed by Terraform"
      + egress                 = (known after apply)
      + id                     = (known after apply)
      + ingress                = (known after apply)
      + name                   = "foo"
      + name_prefix            = (known after apply)
      + owner_id               = (known after apply)
      + revoke_rules_on_delete = false
      + tags_all               = (known after apply)
      + vpc_id                 = (known after apply)
    }

Plan: 2 to add, 0 to change, 0 to destroy.
aws_security_group.bar: Creating...
aws_security_group.bar: Creation complete after 6s [id=sg-6c44c1c79799ce7b2]
aws_security_group.foo: Creating...
aws_security_group.foo: Creation complete after 0s [id=sg-23ffbb130d22cec58]

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

# terraform state list
aws_security_group.bar
aws_security_group.foo

The resources have been created.

Trying state rm and import

I removed only foo from the state, and will now try importing it.

# terraform state rm aws_security_group.foo
Removed aws_security_group.foo
Successfully removed 1 resource instance(s).

# terraform state list
aws_security_group.bar

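Since we are here anyway, let's try the import block that is new in v1.5. The id needed for the import can be found in the apply log from when the resource was created; substitute your own value as you read.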

import {
  id = "sg-23ffbb130d22cec58"
  to = aws_security_group.foo
}

resource "aws_security_group" "bar" {
  name = "bar"
}

Replace the resource block for aws_security_group.foo with an import block, then generate the resource block by running plan -generate-config-out.

# terraform plan -generate-config-out=generated.tf
aws_security_group.bar: Refreshing state... [id=sg-6c44c1c79799ce7b2]
aws_security_group.foo: Preparing import... [id=sg-23ffbb130d22cec58]
aws_security_group.foo: Refreshing state... [id=sg-23ffbb130d22cec58]

Terraform will perform the following actions:

  # aws_security_group.foo will be imported
  # (config will be generated)
    resource "aws_security_group" "foo" {
        arn         = "arn:aws:ec2:ap-northeast-1:000000000000:security-group/sg-23ffbb130d22cec58"
        description = "Managed by Terraform"
        egress      = []
        id          = "sg-23ffbb130d22cec58"
        ingress     = []
        name        = "foo"
        owner_id    = "000000000000"
        tags        = {}
        tags_all    = {}
        vpc_id      = "vpc-5d80b3ff"
    }

Plan: 1 to import, 0 to add, 0 to change, 0 to destroy.
╷
│ Warning: Config generation is experimental
│
│ Generating configuration during import is currently experimental, and the generated configuration format may change
│ in future versions.
╵

─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Terraform has generated configuration and written it to generated.tf. Please review the configuration and edit it as
necessary before adding it to version control.

Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if
you run "terraform apply" now.

The following tf file was generated.

# __generated__ by Terraform
# Please review these resources and move them into your main configuration files.

# __generated__ by Terraform from "sg-23ffbb130d22cec58"
resource "aws_security_group" "foo" {
  description            = "Managed by Terraform"
  egress                 = []
  ingress                = []
  name                   = "foo"
  name_prefix            = null
  revoke_rules_on_delete = null
  tags                   = {}
  tags_all               = {}
  vpc_id                 = "vpc-5d80b3ff"
}

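The attributes with zero values such as [], null, and {} feel a bit verbose.
We could look up which attributes are required, but if something is missing the validation error will tell us anyway, so for now let's comment them all out and see what happens.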

# __generated__ by Terraform
# Please review these resources and move them into your main configuration files.

# __generated__ by Terraform from "sg-23ffbb130d22cec58"
resource "aws_security_group" "foo" {
  # description = "Managed by Terraform"
  # egress                 = []
  # ingress                = []
  # name = "foo"
  # name_prefix            = null
  # revoke_rules_on_delete = null
  # tags                   = {}
  # tags_all               = {}
  # vpc_id                 = "vpc-5d80b3ff"
}

With the file in this state, let's run plan.

# terraform plan
aws_security_group.foo: Preparing import... [id=sg-23ffbb130d22cec58]
aws_security_group.bar: Refreshing state... [id=sg-6c44c1c79799ce7b2]
aws_security_group.foo: Refreshing state... [id=sg-23ffbb130d22cec58]

Terraform will perform the following actions:

  # aws_security_group.foo will be imported
    resource "aws_security_group" "foo" {
        arn         = "arn:aws:ec2:ap-northeast-1:000000000000:security-group/sg-23ffbb130d22cec58"
        description = "Managed by Terraform"
        egress      = []
        id          = "sg-23ffbb130d22cec58"
        ingress     = []
        name        = "foo"
        owner_id    = "000000000000"
        tags        = {}
        tags_all    = {}
        vpc_id      = "vpc-5d80b3ff"
    }

Plan: 1 to import, 0 to add, 0 to change, 0 to destroy.

─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if
you run "terraform apply" now.

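Oops, this drops the name = "foo" setting that was specified when the resource was first created, yet the import apparently goes through with no diff.
I have a slightly bad feeling about this, but let's import it anyway.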

# terraform apply -auto-approve
aws_security_group.foo: Preparing import... [id=sg-23ffbb130d22cec58]
aws_security_group.foo: Refreshing state... [id=sg-23ffbb130d22cec58]
aws_security_group.bar: Refreshing state... [id=sg-6c44c1c79799ce7b2]

Terraform will perform the following actions:

  # aws_security_group.foo will be imported
    resource "aws_security_group" "foo" {
        arn         = "arn:aws:ec2:ap-northeast-1:000000000000:security-group/sg-23ffbb130d22cec58"
        description = "Managed by Terraform"
        egress      = []
        id          = "sg-23ffbb130d22cec58"
        ingress     = []
        name        = "foo"
        owner_id    = "000000000000"
        tags        = {}
        tags_all    = {}
        vpc_id      = "vpc-5d80b3ff"
    }

Plan: 1 to import, 0 to add, 0 to change, 0 to destroy.
aws_security_group.foo: Importing... [id=sg-23ffbb130d22cec58]
aws_security_group.foo: Import complete [id=sg-23ffbb130d22cec58]

Apply complete! Resources: 1 imported, 0 added, 0 changed, 0 destroyed.

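The import went through with no plan diff.

Recreating the resource

What happens if we recreate the foo resource in this state? We can recreate it with terraform apply -replace.
For those unfamiliar with -replace: it has the same effect as commenting out the resource block and applying, then uncommenting it and applying again, except that the destroy and create happen in a single run.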

# terraform apply -replace=aws_security_group.foo
aws_security_group.foo: Refreshing state... [id=sg-23ffbb130d22cec58]
aws_security_group.bar: Refreshing state... [id=sg-6c44c1c79799ce7b2]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with
the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # aws_security_group.foo will be replaced, as requested
-/+ resource "aws_security_group" "foo" {
      ~ arn                    = "arn:aws:ec2:ap-northeast-1:000000000000:security-group/sg-23ffbb130d22cec58" -> (known after apply)
      ~ egress                 = [] -> (known after apply)
      ~ id                     = "sg-23ffbb130d22cec58" -> (known after apply)
      ~ ingress                = [] -> (known after apply)
      ~ name                   = "foo" -> (known after apply)
      + name_prefix            = (known after apply)
      ~ owner_id               = "000000000000" -> (known after apply)
      + revoke_rules_on_delete = false
      - tags                   = {} -> null
      ~ tags_all               = {} -> (known after apply)
      ~ vpc_id                 = "vpc-5d80b3ff" -> (known after apply)
        # (1 unchanged attribute hidden)
    }

Plan: 1 to add, 0 to change, 1 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_security_group.foo: Destroying... [id=sg-23ffbb130d22cec58]
aws_security_group.foo: Destruction complete after 0s
aws_security_group.foo: Creating...
aws_security_group.foo: Creation complete after 1s [id=sg-ad987d40c176f1729]

Apply complete! Resources: 1 added, 0 changed, 1 destroyed.

# terraform state list
aws_security_group.bar
aws_security_group.foo

The resource has been recreated.
To compare results, we remove foo from the tfstate once more and import it again in the same way.

# terraform state rm aws_security_group.foo
Removed aws_security_group.foo
Successfully removed 1 resource instance(s).

Move the previously generated generated.tf out of the way. Recreating the resource also changed its ID, so update the id in the import block.

# mv generated.tf generated.tf.bk

import {
  id = "sg-ad987d40c176f1729"
  to = aws_security_group.foo
}

resource "aws_security_group" "bar" {
  name = "bar"
}

Generate the resource block for foo again.

# terraform plan -generate-config-out=generated.tf
aws_security_group.foo: Preparing import... [id=sg-ad987d40c176f1729]
aws_security_group.foo: Refreshing state... [id=sg-ad987d40c176f1729]
aws_security_group.bar: Refreshing state... [id=sg-6c44c1c79799ce7b2]

Planning failed. Terraform encountered an error while generating this plan.

╷
│ Warning: Config generation is experimental
│
│ Generating configuration during import is currently experimental, and the generated configuration format may change
│ in future versions.
╵
╷
│ Error: Conflicting configuration arguments
│
│   with aws_security_group.foo,
│   on generated.tf line 4:
│   (source code not available)
│
│ "name": conflicts with name_prefix
╵
╷
│ Error: Conflicting configuration arguments
│
│   with aws_security_group.foo,
│   on generated.tf line 5:
│   (source code not available)
│
│ "name_prefix": conflicts with name
╵

This time we get an error saying that name and name_prefix conflict. The generated tf file looks like this.

# __generated__ by Terraform
# Please review these resources and move them into your main configuration files.

# __generated__ by Terraform
resource "aws_security_group" "foo" {
  description            = "Managed by Terraform"
  egress                 = []
  ingress                = []
  name                   = "terraform-20230613142830207200000001"
  name_prefix            = "terraform-"
  revoke_rules_on_delete = null
  tags                   = {}
  tags_all               = {}
  vpc_id                 = "vpc-5d80b3ff"
}

Compared with the resource definition we wrote during preparation, name = "foo" has become name = "terraform-20230613142830207200000001".
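One way to get past this validation error, sketched here as a hand edit rather than anything Terraform generates, is to keep only one of the two conflicting arguments. The variant below keeps name and drops name_prefix. Note that it pins the auto-generated name; it does not bring back the original foo.

```hcl
# Hand-edited sketch of generated.tf (not Terraform output).
# "name" and "name_prefix" conflict, so only one of them is kept.
resource "aws_security_group" "foo" {
  description            = "Managed by Terraform"
  egress                 = []
  ingress                = []
  name                   = "terraform-20230613142830207200000001"
  # name_prefix          = "terraform-"   (removed: conflicts with name)
  revoke_rules_on_delete = null
  tags                   = {}
  tags_all               = {}
  vpc_id                 = "vpc-5d80b3ff"
}
```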

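Discussion

For simplicity, this experiment created the resources with Terraform, but the real use case is that someone built something through the management console and it is later imported under Terraform management. Even if a name was specified when the resource was first built, importing it and confirming that there is no plan diff does not guarantee that the resource can be fully reproduced, because the name can change when it is recreated.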

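To avoid misunderstanding: I am not saying that import should be avoided because it is imperfect. On the contrary, I think manually created resources should be imported as much as possible to increase coverage. We should simply be aware of the limitation that an import followed by a plan with no diff does not, on its own, mean the environment can be fully reproduced.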

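If reproducibility matters, actually recreating the resource is the most reliable check. In practice, though, many resources cannot be recreated casually, so it would be nice to infer the outcome some other way, for example from the schema definition. But when an attribute is omitted, whether the provider fills in a default, whether the cloud API fills one in, how a zero value is interpreted, and how representational differences are suppressed all depend on the provider implementation and the cloud API for each individual resource type. Predicting in general what happens when an optional attribute is omitted therefore looks difficult.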

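Even for an Optional attribute, once its value is recorded in tfstate, changing it in the tf file is usually detected as a plan diff. Removing it may not be, as the name attribute in this experiment shows. Seen from the other side, leaving out an Optional attribute amounts to saying you have no interest in that setting and do not mind if its value becomes something else. In that sense, Terraform does behave as declared within the range of attributes that are set explicitly. In other words, you have to declare the range you want reproduced. Put that way, it makes sense.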
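Applied to the experiment above, declaring the range to reproduce could look like the following sketch. It assumes the import is done while the security group is still named foo (that is, before the -replace run); the id is the one from the earlier apply log.

```hcl
import {
  id = "sg-23ffbb130d22cec58"
  to = aws_security_group.foo
}

resource "aws_security_group" "foo" {
  # Stated explicitly: the name is part of what should survive a recreation.
  name = "foo"

  # description, vpc_id, tags and rules are left out on purpose,
  # which means their values are allowed to fall back to defaults.
}
```

With name declared, a later -replace should recreate the group under the same name rather than under a generated terraform-... one.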

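If the priority is to reproduce the current state of the environment as faithfully as possible, filling in a value for every optional attribute is the safer choice, and that is what the output of terraform plan -generate-config-out does. For whoever inherits the maintenance work, however, a list of default values written with no particular intent is a heavy cognitive burden. This example has only a handful of attributes, but some resource types have dozens, and the line count grows further when many resources are imported. The reader has to work out, for each value, whether it is a default or a reference tied to another resource, and picking the real relationships out of a pile of parameters is hard work. Where possible, removing default values at import time will make life more pleasant for later readers.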
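As an illustration of that kind of cleanup, the first generated.tf from this experiment might be trimmed as below. Which values count as defaults is my reading of the apply log in the preparation step, where description showed up as "Managed by Terraform" without being set. For real resources it should be checked against the provider documentation.

```hcl
resource "aws_security_group" "foo" {
  # Kept: carries intent, and is lost on recreation if omitted.
  name = "foo"

  # Kept on the assumption that the group must stay in this VPC;
  # drop it if the default VPC is acceptable.
  vpc_id = "vpc-5d80b3ff"

  # Dropped as apparent defaults or zero values:
  #   description = "Managed by Terraform"
  #   egress = [], ingress = [], tags = {}, tags_all = {}
  #   name_prefix = null, revoke_rules_on_delete = null
}
```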

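Summary

Importing existing resources with Terraform and checking that there is no plan diff does not necessarily mean the environment can be fully reproduced. Omitting an optional attribute means you are not interested in that setting and do not care if its value changes. Within the declared range, things behave as declared. What matters is being clear about which range you want to reproduce. Treat Terraform configuration not as a mere parameter sheet but as a design document that makes the intent behind each value explicit. If you understand what an attribute means and there is a reason to pin its value, write it out even if it equals the default. Where you have no particular requirement and accept the default, leave it out. For configuration inherited from someone else whose intent is unclear, either accept the current state as is for now, or, if you can, look up the defaults in the provider implementation or the API and drop the values that do not appear to have been changed from them, so that the people who come after you have an easier time.

This may all sound obvious, but I suspect quite a few people hold the rough mental model that no plan diff = the environment can be reproduced, so I wrote it down on purpose.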
