NetworkManager strongSwan encryption algorithm ‘DES-CBC’ not supported

Recently we ran into an issue involving NetworkManager and strongSwan. The error in the systemd journal was a cryptic “encryption algorithm ‘DES-CBC’ not supported”, as shown in the following log excerpt:

Jul 19 19:14:00 el-valle NetworkManager[733]:  [1532042040.2613] audit: op="connection-activate" uuid="26f20e51-92ba-4a78-a1>
[...]
Jul 19 19:14:00 el-valle NetworkManager[733]:  [1532042040.2764] vpn-connection[0x56050004c1f0,26f20e51-92ba-4a78-a17e-1709b>
Jul 19 19:14:00 el-valle charon-nm[17026]: 05[ASN] encryption algorithm 'DES-CBC' not supported
Jul 19 19:14:00 el-valle charon-nm[17026]: 05[LIB] building CRED_PRIVATE_KEY - RSA failed, tried 8 builders
Jul 19 19:14:00 el-valle NetworkManager[733]:  [1532042040.2862] vpn-connection[0x56050004c1f0,26f20e51-92ba-4a78-a17e-1709b>
Jul 19 19:14:00 el-valle charon-nm[17026]: 05[CFG] received initiate for NetworkManager connection Acme strongSwan
Jul 19 19:14:00 el-valle charon-nm[17026]: 05[CFG] using CA certificate, gateway identity 'vpn.acme.com'
Jul 19 19:14:00 el-valle charon-nm[17026]: 05[ASN] encryption algorithm 'DES-CBC' not supported
Jul 19 19:14:00 el-valle charon-nm[17026]: 05[LIB] building CRED_PRIVATE_KEY - ANY failed, tried 7 builders

In the end we tracked this down to strongSwan being unable to read a private key that had been encrypted with DES. The solution was to re-encrypt the private key using AES-256:

shell$ sudo openssl rsa -in client_key.pem -aes256 -out newkey.pem
Enter pass phrase for client_key.pem:
writing RSA key
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
shell$ sudo mv newkey.pem client_key.pem
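To confirm that the re-encryption took, the key can be round-tripped with OpenSSL. A minimal sketch using a scratch key (the file names and the pass:demo passphrase are examples, not the real VPN key):

```shell
# Generate a scratch RSA key and encrypt it with AES-256, as in the fix above.
openssl genrsa -out scratch.pem 2048
openssl rsa -in scratch.pem -aes256 -passout pass:demo -out scratch_aes.pem

# The PEM file now carries an encryption marker, and the key still loads
# with its passphrase:
grep -q ENCRYPTED scratch_aes.pem && echo "key is passphrase-protected"
openssl rsa -in scratch_aes.pem -passin pass:demo -noout -check
```

Depending on the OpenSSL version the encrypted key is written either in traditional PEM format (with a "DEK-Info: AES-256-CBC" header) or as a PKCS#8 "ENCRYPTED PRIVATE KEY" block; both carry an ENCRYPTED marker.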

The following post was helpful to figure out what was happening:

https://lists.strongswan.org/pipermail/users/2017-June/011088.html

Site-to-site VPN Between Cisco ASA/FTD and strongSwan

I recently spent about two days bringing up a simple site-to-site IPsec VPN tunnel between a Cisco ASA/FTD and a Linux machine running strongSwan, using digital certificates to authenticate the peers. The configuration was simple, but due to a little “detail” and a lack of good debugging information on the Cisco ASA/FTD, what should have been a five-minute job turned into a couple of days of troubleshooting, reading the strongSwan source code, and making configuration changes to try to get it to work. In the end I was able to bring up the tunnel and get to the bottom of what the Cisco ASA/FTD did not like about what strongSwan was sending. I will document the configurations here and, at the end, show what the Cisco ASA/FTD was choking on.

Cisco ASA Configuration

The basic VPN configuration on the Cisco ASA side looks like this:

access-list traffic-to-encrypt extended permit ip 10.123.0.0 255.255.255.0 10.123.1.0 255.255.255.0 
!
crypto ipsec ikev2 ipsec-proposal IPSEC-PROPOSAL
 protocol esp encryption aes-256
 protocol esp integrity sha-256 sha-1
!
crypto map MYMAP 10 match address traffic-to-encrypt
crypto map MYMAP 10 set peer 10.118.57.149 
crypto map MYMAP 10 set ikev2 ipsec-proposal IPSEC-PROPOSAL
crypto map MYMAP 10 set trustpoint TRUSTPOINT chain
!
crypto map MYMAP interface outside
!
crypto ca trustpoint TRUSTPOINT
 revocation-check crl
 keypair TRUSTPOINT
 crl configure
 policy static
 url 1 http://www.chapus.net/ChapulandiaCA.crl
!
crypto ikev2 policy 10
 encryption aes-256
 integrity sha256
 group 14
 prf sha
 lifetime seconds 86400
!
crypto ikev2 enable outside
!
tunnel-group 10.118.57.149 type ipsec-l2l
tunnel-group 10.118.57.149 ipsec-attributes
 ikev2 remote-authentication certificate
 ikev2 local-authentication certificate TRUSTPOINT

Note that the FTD configuration is very similar, but it has to be performed via the Firepower Management Center (FMC) GUI. In fact, after doing the configuration via FMC one can log into the FTD CLI using SSH and run the command “show running-config” and see the same configuration shown above for the ASA.

strongSwan Configuration (ipsec.conf)

The ipsec.conf configuration file (typically located at /etc/ipsec.conf) is the old way of configuring the strongSwan IPsec subsystem. The following ipsec.conf file contents allowed the tunnel to come up with no problems:

config setup
	strictcrlpolicy=yes
	cachecrls=yes

ca MyCA
	crluri=http://www.example.com/MyCA.crl
	cacert=ca.pem
	auto=add

conn %default
	ikelifetime=60m
	keylife=20m
	rekeymargin=3m
	keyingtries=1
	keyexchange=ikev2
	mobike=no

conn net-net
	leftcert=rpi.pem
	leftsubnet=10.123.1.0/24
	leftfirewall=yes
	right=10.122.109.113
	rightid="C=US, ST=CA, L=SF, O=Acme, OU=CSS, CN=asa, E=admin@example.com"
	rightsubnet=10.123.0.0/24
	auto=add

In addition to the ipsec.conf file, the ipsec.secrets file (typically /etc/ipsec.secrets) also has to be edited, in this case to indicate the name of the private RSA key. Our ipsec.secrets file looks like this:

# ipsec.secrets - strongSwan IPsec secrets file

: RSA mykey.pem

Finally, the certificates and the RSA key, all in PEM format, must be placed in specific directories under /etc/ipsec.d:

The Linux machine’s identity certificate goes into /etc/ipsec.d/certs/; strongSwan automatically loads it at startup.
The Certification Authority (CA) root certificate goes into /etc/ipsec.d/cacerts/.
The private key goes into /etc/ipsec.d/private/.
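A sketch of the file placement and reload steps implied above (paths are the strongSwan defaults, file names match the configuration shown, and the commands assume the classic “ipsec” starter tool):

```shell
sudo cp rpi.pem   /etc/ipsec.d/certs/
sudo cp ca.pem    /etc/ipsec.d/cacerts/
sudo cp mykey.pem /etc/ipsec.d/private/
sudo chmod 600 /etc/ipsec.d/private/mykey.pem

# Re-read secrets and certificates, then bring the tunnel up manually:
sudo ipsec rereadall
sudo ipsec up net-net
```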

strongSwan Configuration (swanctl.conf)

swanctl.conf is a new configuration file that is used by the swanctl(8) tool to load configurations and credentials into the strongSwan IKE daemon. This is the “new” way to configure the strongSwan IPsec subsystem. The configuration file syntax is very different, though the parameters that need to be set to be able to bring up the IPsec tunnel are the same as in the case of the ipsec.conf-based configuration.

A swanctl.conf-based configuration is more modular. Configuration files typically live under /etc/swanctl/. For our specific connection, we put the configuration in the file /etc/swanctl/conf.d/example.conf, which gets included from /etc/swanctl/swanctl.conf. Our /etc/swanctl/conf.d/example.conf file contains the following:

connections {

    # Section for an IKE connection named <conn>.
    my-connection {
        # IKE major version to use for connection.
        version = 2

        # Remote address(es) to use for IKE communication, comma separated.
        # remote_addrs = %any
        remote_addrs = 10.122.109.113

        # Section for a local authentication round.
        local-1 {
            # Comma separated list of certificate candidates to use for
            # authentication.
            certs = rpi.pem
        }

        children {

            # CHILD_SA configuration sub-section.
            my-connection {
                # Local traffic selectors to include in CHILD_SA.
                # local_ts = dynamic
                local_ts = 10.123.1.0/24

                # Remote selectors to include in CHILD_SA.
                # remote_ts = dynamic
                remote_ts = 10.123.0.0/24
            }
        }
    }
}

# Section defining secrets for IKE/EAP/XAuth authentication and private key
# decryption.
secrets {
    # Private key decryption passphrase for a key in the private folder.
    private-rpikey {
        # File name in the private folder for which this passphrase should be
        # used.
        file = rpi.pem

        # Value of decryption passphrase for private key.
        # secret =
    }
}

# Section defining attributes of certification authorities.
authorities {
    # Section defining a certification authority with a unique name.
    MyCA {
        # CA certificate belonging to the certification authority.
        cacert = myca.pem

        # Comma-separated list of CRL distribution points.
        crl_uris = http://www.chapus.net/ChapulandiaCA.crl
    }
}
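With the swanctl backend, configurations and credentials are pushed into the IKE daemon by the swanctl tool (the strongswan-swanctl systemd unit typically does this at startup). A sketch of the load-and-initiate sequence, using the connection and child names from the configuration above:

```shell
sudo swanctl --load-all            # load connections, certificates and secrets from /etc/swanctl
sudo swanctl --initiate --child my-connection
sudo swanctl --list-sas            # verify the IKE and CHILD SAs are up
```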

Bringing Up the Tunnel on Interesting Traffic

To bring up the tunnel when “interesting” traffic arrives, it is necessary to use the “start_action” configuration parameter. Otherwise the IPsec tunnel has to be brought up manually with the swanctl --initiate command.

Here’s an example configuration that uses “start_action”:

connections {

    # Section for an IKE connection named <conn>.
    lab-vpn {
        version = 2

        remote_addrs = 10.1.10.114

        local-1 {
            certs = rpi.pem
        }

        children {
            # CHILD_SA configuration sub-section.
            lab-vpn {
                local_ts = 10.123.1.0/24, 10.10.0.0/16
                remote_ts = 10.123.0.0/24

                start_action = trap
            }
        }
    }
}

Issues

There were four serious issues that I ran into when trying to bring up the site-to-site tunnel. All of them appear to be bugs.

Cisco ASA/FTD Unable to Process Downloaded CRL When Cisco WSA in the Middle

In this issue the Cisco ASA/FTD is apparently unable to parse a downloaded CRL when a Cisco WSA proxy server is transparently in the middle. The WSA returns the file to the Cisco ASA/FTD, but the ASA apparently dislikes something in the HTTP headers (the “Via” header? I don’t know). There is nothing wrong with the CRL itself: I performed a packet capture on the ASA, extracted the CRL file from the capture, and it is not corrupted. In fact, I have seen the revocation check work sometimes; I believe the problem occurs when the CRL is already present in the WSA’s cache, which would explain the intermittent behavior. I configured the web server hosting the CRL to prevent caching of the file, but the problem persists.

Workaround for this problem: Configure the ASA to fall back to no revocation check, i.e.

crypto ca trustpoint X
 revocation-check crl none

PRF Algorithms Other Than SHA1 Do Not Work

No idea if the problem here is on the Cisco ASA/FTD side or on the strongSwan side. All I know is that strongSwan fails to authenticate the peer. I see these messages in the strongSwan logs:

[ENC] parsed IKE_AUTH response 1 [ V IDr CERT AUTH SA TSi TSr N(ESP_TFC_PAD_N) N(NON_FIRST_FRAG) N(MOBIKE_SUP) ]
[IKE] received end entity cert ""
[CFG]   using certificate ""
[CFG]   using trusted ca certificate ""
[CFG] checking certificate status of ""
[CFG]   using trusted certificate ""
[CFG]   crl correctly signed by ""
[CFG]   crl is valid: until Jan 06 02:12:01 2018
[CFG]   using cached crl
[CFG] certificate status is good
[CFG]   reached self-signed root ca with a path length of 0
[IKE] signature validation failed, looking for another key
[CFG]   using certificate ""
[CFG]   using trusted ca certificate ""
[CFG] checking certificate status of ""
[CFG]   using trusted certificate ""
[CFG]   crl correctly signed by ""
[CFG]   crl is valid: until Jan 06 02:12:01 2018
[CFG]   using cached crl
[CFG] certificate status is good
[CFG]   reached self-signed root ca with a path length of 0
[IKE] signature validation failed, looking for another key
[ENC] generating INFORMATIONAL request 2 [ N(AUTH_FAILED) ]
[NET] sending packet: from 10.118.57.151[4500] to 10.122.109.113[4500] (80 bytes)
initiate failed: establishing CHILD_SA 'css-lab' failed

Workaround for this problem: Use SHA-1 as the PRF. For example, on the ASA, one could use:

crypto ikev2 policy 10
 encryption aes-256
 integrity sha256
 group 19
 prf sha
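On the strongSwan side, the PRF can be pinned to match by listing it explicitly in the IKE proposal. A hypothetical sketch in ipsec.conf syntax (the proposal string is an example; “prfsha1” selects SHA-1 as the PRF and the trailing “!” makes the proposal strict):

```
conn net-net
	ike=aes256-sha256-prfsha1-modp2048!
```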

Certificates Using ASN.1 “PRINTABLESTRING” Don’t Work on Cisco ASA/FTD

This one was very difficult to troubleshoot. It might be a bug on the strongSwan side, but I am not sure. The issue is that, depending on the configuration, strongSwan sends the Cisco ASA/FTD an IKEv2 identity consisting of a Distinguished Name (DN) in binary ASN.1 encoding, but when it builds this encoding from scratch it uses the type “PRINTABLESTRING” instead of “UTF8STRING” for fields like countryName, stateOrProvinceName, localityName, organizationName, commonName, etc. The IKEv2 identity is otherwise identical to the one strongSwan would obtain directly from the certificate.

On the ASA/FTD side, when it receives an identity whose fields use the type “PRINTABLESTRING”, it seems to consider the identity bad and chokes. This is difficult to troubleshoot because there apparently are no good debug messages showing what is going on. In a bad case one sees these messages:

%ASA-7-711001: IKEv2-PLAT-3: RECV PKT [IKE_AUTH] [10.118.57.149]:500->[10.122.109.113]:500 InitSPI=0x596a08fccb72412a RespSPI=0x5d757649514ab5e8 MID=00000001
%ASA-7-711001: (34):  
%ASA-7-711001: IKEv2-PROTO-2: (34): Received Packet [From 10.118.57.149:500/To 10.122.109.113:500/VRF i0:f0] 
[...]
%ASA-7-711001:  IDr%ASA-7-711001:   Next payload: CERT, reserved: 0x0, length: 128
%ASA-7-711001:     Id type: DER ASN1 DN, Reserved: 0x0 0x0
%ASA-7-711001: 
%ASA-7-711001:      30 76 31 0b 30 09 06 03 55 04 06 13 02 55 53 31
%ASA-7-711001:      0b 30 09 06 03 55 04 08 13 02 4e 43 31 0c 30 0a
%ASA-7-711001:      06 03 55 04 07 13 03 52 54 50 31 0e 30 0c 06 03
%ASA-7-711001:      55 04 0a 13 05 43 69 73 63 6f 31 0c 30 0a 06 03
%ASA-7-711001:      55 04 0b 13 03 43 53 53 31 0c 30 0a 06 03 55 04
%ASA-7-711001:      03 13 03 72 70 69 31 20 30 1e 06 09 2a 86 48 86
%ASA-7-711001:      f7 0d 01 09 01 16 11 65 6c 70 61 72 69 73 40 63
%ASA-7-711001:      69 73 63 6f 2e 63 6f 6d
[...]
%ASA-7-711001: IKEv2-PROTO-5: (34): SM Trace-> SA: I_SPI=596A08FCCB72412A R_SPI=5D757649514AB5E8 (I) MsgID = 00000001 CurState: I_WAIT_AUTH Event: EV_RECV_AUTH
%ASA-7-711001: IKEv2-PROTO-5: (34): Action: Action_Null
%ASA-7-711001: IKEv2-PROTO-5: (34): SM Trace-> SA: I_SPI=596A08FCCB72412A R_SPI=5D757649514AB5E8 (I) MsgID = 00000001 CurState: I_PROC_AUTH Event: EV_CHK4_NOTIFY
%ASA-7-711001: IKEv2-PROTO-2: (34): Process auth response notify
%ASA-7-711001: IKEv2-PROTO-5: (34): SM Trace-> SA: I_SPI=596A08FCCB72412A R_SPI=5D757649514AB5E8 (I) MsgID = 00000001 CurState: I_PROC_AUTH Event: EV_PROC_MSG
%ASA-7-711001: IKEv2-PLAT-2: (34): peer auth method set to: 1
%ASA-7-711001: IKEv2-PROTO-5: (34): SM Trace-> SA: I_SPI=596A08FCCB72412A R_SPI=5D757649514AB5E8 (I) MsgID = 00000001 CurState: I_WAIT_AUTH Event: EV_RE_XMT
%ASA-7-711001: IKEv2-PROTO-2: (34): Retransmitting packet
%ASA-7-711001: (34):  
%ASA-7-711001: IKEv2-PROTO-2: (34): Sending Packet [To 10.118.57.149:500/From 10.122.109.113:500/VRF i0:f0] 

As can be seen, the state machine goes from I_WAIT_AUTH (wait for authentication payload) to I_PROC_AUTH (process authentication payload), receives an “EV_PROC_MSG” (process message) event, and then goes back to the I_WAIT_AUTH state with a retransmit (EV_RE_XMT) event. There is no explanation or message that indicates why processing the IKEv2 identity failed.

In the good case, one sees messages like:

%ASA-7-711001: IKEv2-PROTO-5: (35): SM Trace-> SA: I_SPI=03542332E12C42F4 R_SPI=3E95373B6C8C25AF (I) MsgID = 00000001 CurState: I_WAIT_AUTH Event: EV_RECV_AUTH
%ASA-7-711001: IKEv2-PROTO-5: (35): Action: Action_Null
%ASA-7-711001: IKEv2-PROTO-5: (35): SM Trace-> SA: I_SPI=03542332E12C42F4 R_SPI=3E95373B6C8C25AF (I) MsgID = 00000001 CurState: I_PROC_AUTH Event: EV_CHK4_NOTIFY
%ASA-7-711001: IKEv2-PROTO-2: (35): Process auth response notify
%ASA-7-711001: IKEv2-PROTO-5: (35): SM Trace-> SA: I_SPI=03542332E12C42F4 R_SPI=3E95373B6C8C25AF (I) MsgID = 00000001 CurState: I_PROC_AUTH Event: EV_PROC_MSG
%ASA-7-711001: IKEv2-PLAT-2: (35): peer auth method set to: 1
%ASA-7-711001: IKEv2-PROTO-5: (35): SM Trace-> SA: I_SPI=03542332E12C42F4 R_SPI=3E95373B6C8C25AF (I) MsgID = 00000001 CurState: I_PROC_AUTH Event: EV_CHK_IF_PEER_CERT_NEEDS_TO_BE_FETCHED_FOR_PROF_SEL
%ASA-7-711001: IKEv2-PROTO-5: (35): SM Trace-> SA: I_SPI=03542332E12C42F4 R_SPI=3E95373B6C8C25AF (I) MsgID = 00000001 CurState: I_PROC_AUTH Event: EV_GET_POLICY_BY_PEERID

Notice that after the EV_PROC_MSG event there is no re-transmit event — in the logs I could see that eventually (after checking revocation of the certificate, etc.) the state machine leaves the I_PROC_AUTH state and the connection finally establishes.

The strongSwan configuration that caused the above problem specified “leftid” in /etc/ipsec.conf, i.e.

conn net-net
	leftcert=rpi.pem
	leftsubnet=10.123.1.0/24
	leftid="C=US, ST=CA, L=SF, O=Acme, OU=CSS, CN=myid, E=admin@example.com"
	leftfirewall=yes
	right=10.122.109.113
	rightid="C=US, ST=CA, L=SF, O=Acme, OU=CSS, CN=asa, E=admin@example.com"

If leftid is removed and strongSwan is left to automatically determine the identity to send to the Cisco ASA/FTD, the problem does not occur. I think this is because strongSwan then does not create the ID from scratch but instead extracts it from the identity certificate.

Here’s a diff of the output of “openssl asn1parse” for the IKEv2 ID whose binary ASN.1 encoding uses “UTF8STRING” fields versus the one that uses “PRINTABLESTRING”:

    15:d=1  hl=2 l=  11 cons: SET               
    17:d=2  hl=2 l=   9 cons: SEQUENCE          
    19:d=3  hl=2 l=   3 prim: OBJECT            :stateOrProvinceName
-   24:d=3  hl=2 l=   2 prim: UTF8STRING        :CA
+   24:d=3  hl=2 l=   2 prim: PRINTABLESTRING   :CA
    28:d=1  hl=2 l=  12 cons: SET               
    30:d=2  hl=2 l=  10 cons: SEQUENCE          
    32:d=3  hl=2 l=   3 prim: OBJECT            :localityName
-   37:d=3  hl=2 l=   3 prim: UTF8STRING        :SF
+   37:d=3  hl=2 l=   3 prim: PRINTABLESTRING   :SF
    42:d=1  hl=2 l=  14 cons: SET               
    44:d=2  hl=2 l=  12 cons: SEQUENCE          
    46:d=3  hl=2 l=   3 prim: OBJECT            :organizationName
-   51:d=3  hl=2 l=   5 prim: UTF8STRING        :Acme
+   51:d=3  hl=2 l=   5 prim: PRINTABLESTRING   :Acme
    58:d=1  hl=2 l=  12 cons: SET               
    60:d=2  hl=2 l=  10 cons: SEQUENCE          
    62:d=3  hl=2 l=   3 prim: OBJECT            :organizationalUnitName
-   67:d=3  hl=2 l=   3 prim: UTF8STRING        :CSS
+   67:d=3  hl=2 l=   3 prim: PRINTABLESTRING   :CSS
    72:d=1  hl=2 l=  12 cons: SET               
    74:d=2  hl=2 l=  10 cons: SEQUENCE          
    76:d=3  hl=2 l=   3 prim: OBJECT            :commonName
-   81:d=3  hl=2 l=   3 prim: UTF8STRING        :rpi
+   81:d=3  hl=2 l=   3 prim: PRINTABLESTRING   :rpi
    86:d=1  hl=2 l=  32 cons: SET               
    88:d=2  hl=2 l=  30 cons: SEQUENCE          
    90:d=3  hl=2 l=   9 prim: OBJECT            :emailAddress

Workaround for this issue: Do not use leftid and let strongSwan figure out the IKEv2 ID that it needs to present to the Cisco ASA/FTD.
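The encoding difference can be reproduced without strongSwan at all. A minimal sketch: two hand-built DER encodings of a single RDN (CN=rpi), identical except for the one tag byte before the string (0x0c = UTF8STRING, 0x13 = PRINTABLESTRING), fed to openssl asn1parse. The octal escapes below spell out the DER bytes 31 0c 30 0a 06 03 55 04 03 <tag> 03 72 70 69:

```shell
# RDN: SET { SEQUENCE { OID 2.5.4.3 (commonName), <string> "rpi" } }
printf '\061\014\060\012\006\003\125\004\003\014\003\162\160\151' > cn_utf8.der
printf '\061\014\060\012\006\003\125\004\003\023\003\162\160\151' > cn_printable.der

openssl asn1parse -inform DER -in cn_utf8.der       # output includes: UTF8STRING :rpi
openssl asn1parse -inform DER -in cn_printable.der  # output includes: PRINTABLESTRING :rpi
```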

Multiple Traffic Selectors Under Same Child SA

If the strongSwan configuration specifies multiple networks in one traffic selector, like in this configuration:

children {
    # CHILD_SA configuration sub-section.
    lab-vpn {
        # Local traffic selectors to include in CHILD_SA.
        # local_ts = dynamic
        local_ts = 10.123.1.0/24, 10.10.0.0/16

        # Remote selectors to include in CHILD_SA.
        # remote_ts = dynamic
        remote_ts = 10.123.0.0/24
    }
}

then the Cisco device will receive TSi and TSr payloads in an IKEv2 message that look like these:

 TSi Next payload: TSr, reserved: 0x0, length: 56
 Num of TSs: 3, reserved 0x0, reserved 0x0
 TS type: TS_IPV4_ADDR_RANGE, proto id: 1, length: 16
 start port: 2048, end port: 2048
 start addr: 10.123.1.2, end addr: 10.123.1.2
 TS type: TS_IPV4_ADDR_RANGE, proto id: 0, length: 16
 start port: 0, end port: 65535
 start addr: 10.123.1.0, end addr: 10.123.1.255
 TS type: TS_IPV4_ADDR_RANGE, proto id: 0, length: 16
 start port: 0, end port: 65535
 start addr: 10.10.0.0, end addr: 10.10.255.255
 TSr Next payload: NOTIFY, reserved: 0x0, length: 40
 Num of TSs: 2, reserved 0x0, reserved 0x0
 TS type: TS_IPV4_ADDR_RANGE, proto id: 1, length: 16
 start port: 2048, end port: 2048
 start addr: 10.123.0.5, end addr: 10.123.0.5
 TS type: TS_IPV4_ADDR_RANGE, proto id: 0, length: 16
 start port: 0, end port: 65535
 start addr: 10.123.0.0, end addr: 10.123.0.255

As can be seen, the TSi payload contains multiple Traffic Selectors (one for 10.123.1.0/24 and another one for 10.10.0.0/16). This is based on the strongSwan configuration “local_ts = 10.123.1.0/24, 10.10.0.0/16”.

The idea is that the IPsec gateway that strongSwan is talking to should create IPsec Security Associations (SAs) for 10.123.1.0/24 <-> 10.123.0.0/24 and for 10.10.0.0/16 <-> 10.123.0.0/24.

Unfortunately, Cisco devices do not support this and instead only create SAs for the first traffic selector in the IKE message. There is a Cisco bug for this issue on Cisco ASA, but it does not appear that it will be fixed any time soon (as of May 2018):

CSCue42170 (“IKEv2: Support Multi Selector under the same child SA”)

strongSwan users have reported the problem:

https://wiki.strongswan.org/issues/758

A workaround has been proposed in that thread: create multiple connections, one for each protected network, instead of one connection with multiple protected networks.
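One way to express that workaround in swanctl.conf is to define one child section per protected network, so each CHILD_SA carries a single traffic selector. A hypothetical sketch (the child names are examples, and this shape has not been verified against the ASA):

```
connections {
    lab-vpn {
        # version, remote_addrs and local-1 as in the earlier example

        children {
            lab-vpn-1 {
                local_ts  = 10.123.1.0/24
                remote_ts = 10.123.0.0/24
            }
            lab-vpn-2 {
                local_ts  = 10.10.0.0/16
                remote_ts = 10.123.0.0/24
            }
        }
    }
}
```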

Conclusion

A site-to-site IPsec-based VPN tunnel between a Cisco ASA/FTD and strongSwan running on Linux, using certificates for authentication, comes up just fine, but I ran into the four issues described above. All of them have reasonable workarounds. They are probably bugs that I’ll try to report to the respective parties.

SSL Certificates Made Easy (and Cheap!)

Running an SSL-enabled website is a best practice, but it is often made difficult by the fact that one needs a Public Key Infrastructure (PKI) to obtain the SSL certificates needed for SSL operation.

There are two options for using a PKI: 1. deploy your own PKI, or 2. use a public PKI. The former is cheap (free) but has a steeper learning curve, because one needs to know how to set up the Certification Authority (CA) server software and how to manage the PKI (generate Certificate Signing Requests [CSRs], sign certificates, revoke certificates, deploy the root CA certificate to end users’ devices, etc.). The latter can cost money but is easier: the PKI is already established and one only needs to request a certificate, sometimes for a price.

The Let’s Encrypt project is “[…] a free, automated, and open certificate authority (CA), run for the public’s benefit. It is a service provided by the Internet Security Research Group (ISRG).” See https://letsencrypt.org/about/ for additional details about the Let’s Encrypt project. Two important details about certificates issued by the Let’s Encrypt project are that: 1. They are free, and 2. Browsers trust the CA that issues them, so there is no need to distribute CA root certificates to end users’ devices.

We run an Apache web server that serves a few domains via virtual hosts and it was easy to set them up to use certificates issued by the Let’s Encrypt project. Here are the details:

We run Apache on Ubuntu, so the first thing we had to do was install an ACME client (ACME is the protocol used to fetch certificates). The ACME client recommended by the Let’s Encrypt project is called Certbot. According to Certbot’s website, “Certbot is an easy-to-use automatic client that fetches and deploys SSL/TLS certificates for your webserver. Certbot was developed by EFF and others as a client for Let’s Encrypt and was previously known as ‘the official Let’s Encrypt client’ or ‘the Let’s Encrypt Python client.’ Certbot will also work with any other CAs that support the ACME protocol.”

The Certbot website has clear instructions on how to do this. For us, it was just:

shell$ sudo add-apt-repository ppa:certbot/certbot
shell$ sudo apt-get update
shell$ sudo apt-get install certbot

The next step was to request the certificates. There are Certbot “plugins” that automate this, but we chose a more manual approach that gives us a bit more control:

shell$ sudo certbot certonly --webroot -w /srv/www/www.domain1.net/ -d domain1.net -d www.domain1.net -w /usr/share/wordpress -d www.domain2.com -d domain2.com
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Starting new HTTPS connection (1): acme-v01.api.letsencrypt.org

-------------------------------------------------------------------------------
You have an existing certificate that contains a portion of the domains you
requested (ref: /etc/letsencrypt/renewal/www.domain1.net.conf)

It contains these names: www.domain1.net, domain1.net

You requested these names for the new certificate: domain1.net, www.domain1.net,
www.domain2.com, domain2.com.

Do you want to expand and replace this existing certificate with the new
certificate?
-------------------------------------------------------------------------------
(E)xpand/(C)ancel: e
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for domain1.net
http-01 challenge for www.domain1.net
http-01 challenge for www.domain2.com
http-01 challenge for domain2.com
Using the webroot path /usr/share/wordpress for all unmatched domains.
Waiting for verification...
Cleaning up challenges
Unable to clean up challenge directory /srv/www/www.domain1.net/.well-known/acme-challenge
Generating key (2048 bits): /etc/letsencrypt/keys/0001_key-certbot.pem
Creating CSR: /etc/letsencrypt/csr/0001_csr-certbot.pem

IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at
   /etc/letsencrypt/live/www.domain1.net/fullchain.pem. Your cert will
   expire on 2017-06-26. To obtain a new or tweaked version of this
   certificate in the future, simply run certbot again. To
   non-interactively renew *all* of your certificates, run "certbot
   renew"
 - If you like Certbot, please consider supporting our work by:

   Donating to ISRG / Let's Encrypt: https://letsencrypt.org/donate
   Donating to EFF: https://eff.org/donate-le

Note that I had previously requested a certificate for www.domain1.net, and when I ran Certbot I requested a new domain to be listed in the certificate (www.domain2.com). Certbot noticed that I had previously requested a certificate for www.domain1.net and asked me if I wanted to expand the certificate to include the new domain.

As mentioned in the output from Certbot, the certificates (the website’s identity certificate as well as the CA chain) are left in the /etc/letsencrypt/live/www.domain1.net directory. At this point one just has to configure Apache to use them.
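For reference, a minimal sketch of the Apache virtual-host changes (the domain and DocumentRoot are examples; fullchain.pem and privkey.pem are the file names Certbot maintains under /etc/letsencrypt/live/):

```apache
<VirtualHost *:443>
    ServerName www.domain1.net
    DocumentRoot /srv/www/www.domain1.net
    SSLEngine on
    SSLCertificateFile    /etc/letsencrypt/live/www.domain1.net/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/www.domain1.net/privkey.pem
</VirtualHost>
```

With Apache 2.4.8 and later the full chain can go in SSLCertificateFile; older versions also need SSLCertificateChainFile for the intermediate certificate.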

Printing From Windows 7 to Remote CUPS Printer

It took me a little while to figure out how to get a machine running Windows 7 to print to a remote CUPS printer so I thought I’d document what I did in case it helps others (as well as myself as I am sure I will forget this if I don’t document it)…

The first step is to go to the “Devices and Printers” control panel. There one has to click on the “Add Printer” link at the top (also available in the right-click menu).

In the dialog window that follows, select “Add a network, wireless, or Bluetooth printer”. Windows will then try to automatically find an available network printer. At that point I stop the search by clicking the “Stop” button, and then click “The printer that I want isn’t listed”.

In the next dialog, “Find a printer by name or TCP/IP address”, select the option “Select a shared printer by name” and enter a URL like the following:

http://<IP address or hostname of the CUPS server>:631/printers/<Name of the CUPS printer queue>

Then click “Next”.

The next step is important: it’s where a printer driver must be selected. Normally, the CUPS server knows what printer it has connected. In that case one needs to send print jobs in a format that the CUPS server can understand, like PostScript or PDF, and the CUPS server will convert them to the appropriate language understood by the printer. However, it might be the case that the CUPS server has a raw queue, in which case the CUPS client must send the print job in the format that the printer itself understands.

So, when selecting a driver in the Windows “Add Printer Wizard”, one can do the following:

  1. If not using a raw queue on the CUPS server, select the “Generic” manufacturer, and then the “MS Publisher Color Printer”. This will cause the print job to be of type “application/postscript”, which CUPS can then convert to the right printer language.
  2. If using a raw queue on the CUPS server then select the appropriate printer driver so the Windows client sends the job in the format that the printer can understand.
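For completeness, a raw queue can be created on the CUPS server with lpadmin. A hypothetical sketch (the queue name and device URI are examples; “-m raw” tells CUPS to pass jobs through without filtering):

```shell
sudo lpadmin -p rawqueue -E -v usb://HP/Officejet%202620%20series -m raw
```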

References

“[…] install the native printer drivers for your printer on the Windows computer. If the CUPS server is set up to use its own printer drivers, then you can just select a generic postscript printer for the Windows client(e.g. ‘HP Color LaserJet 8500 PS’ or ‘Xerox DocuTech 135 PS2’).”

Note that I didn’t have luck with the “HP Color LaserJet 8500 PS” printer driver: it generated print jobs in the “PJL encapsulated PostScript document text” format, which CUPS had problems handling. But the “MS Publisher Color Printer” driver worked fine.

  • This page contains good information on how to create a CUPS raw printer queue:

http://opennomad.com/content/raw-cups-configuration-challenge

CUPS strikes again, this time with “Bad Request”

The machine running our CUPS print server at home recently had to be replaced by another one. Today we had to set up a new CUPS client to print to the CUPS server running on the new machine.

It turns out it was not an easy experience — the CUPS client kept saying “Bad Request” and there was nothing in the CUPS log files on the server side. We then realized that we were logging at a level that could be hiding important messages, based on this parameter in /etc/cups/cupsd.conf:

LogLevel warn

Changing this parameter to:

LogLevel debug

and trying again to print or modify the printer produced the following messages in /var/log/cups/error.log:

D [04/Dec/2016:16:30:58 -0500] [Client 1] GET /printers/HP_Officejet_2620_series.ppd HTTP/1.1
D [04/Dec/2016:16:30:58 -0500] cupsdSetBusyState: newbusy="Active clients and dirty files", busy="Dirty files"
D [04/Dec/2016:16:30:58 -0500] [Client 1] Read: status=200
D [04/Dec/2016:16:30:58 -0500] [Client 1] No authentication data provided.
E [04/Dec/2016:16:30:58 -0500] [Client 1] Request from "[v1.2003:1480:eca5:1120:7e7a:91ff:febf:d3b2]" using invalid Host: field "printer.example.com:631".
D [04/Dec/2016:16:30:58 -0500] [Client 1] cupsdSendHeader: code=400, type="text/html", auth_type=0
D [04/Dec/2016:16:30:58 -0500] [Client 1] Closing because Keep-Alive is disabled.
D [04/Dec/2016:16:30:58 -0500] [Client 1] Closing connection.

Aha! ‘Request from “[v1.2003:1480:eca5:1120:7e7a:91ff:febf:d3b2]” using invalid Host: field’!

Now, with an actual error message to search for, I was able to find Debian bug #530027, which suggests setting “ServerAlias *” in /etc/cups/cupsd.conf as a workaround.

After making that change and reloading CUPS things started to work. Note that the root cause of the problem is that the hostname of the print server changed, and the CUPS client was using the hostname of the old server via a DNS CNAME.
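The workaround amounts to this fragment in the server’s /etc/cups/cupsd.conf, followed by a reload of the cups service:

```
# Accept requests regardless of the Host: header value
# (workaround from Debian bug #530027)
ServerAlias *
```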

Beats me why such an error message is not produced unless the CUPS logging level is set to “debug”.

CUPS Remote Printing Filter Failed

I keep getting bitten by this issue every time I set up a new printer on a new machine to print to a remote CUPS server that has the physical printer attached (via USB, for example), and every time it takes me a little while to remember what the problem is. So I have decided to document the problem (and the solution) for the next time I run into it.

The printer is correctly configured on the server, and by “correctly configured” I mean that the printer is using the correct driver, and that printing is working, both locally on the machine with the attached printer, and remotely on other clients.

On the client side, the URI for the printer is correct; for example, ipp://servername.example.com:631/printers/printer_name. Everything seems fine until a document is sent to the printer from the client. When this happens, the server’s cupsd (not the client’s) runs into a problem and nothing gets printed. The server’s cupsd error file (typically /var/log/cups/error_log) contains something like this:

D [16/Oct/2016:21:25:45 -0400] [Job 2] Queued on "HP_Officejet_2620_series" by "username".
 D [16/Oct/2016:21:25:45 -0400] [Job 2] File of type application/vnd.cups-raster queued by "username".
 D [16/Oct/2016:21:25:45 -0400] [Job 2] Adding end banner page "none".
 D [16/Oct/2016:21:25:45 -0400] [Job 2] time-at-processing=1476667545
 D [16/Oct/2016:21:25:45 -0400] [Job 2] 1 filters for job:
 D [16/Oct/2016:21:25:45 -0400] [Job 2] hpcups (application/vnd.cups-raster to printer/HP_Officejet_2620_series, cost 0)
 D [16/Oct/2016:21:25:45 -0400] [Job 2] job-sheets=none,none
 D [16/Oct/2016:21:25:45 -0400] [Job 2] argv[0]="HP_Officejet_2620_series"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] argv[1]="2"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] argv[2]="username"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] argv[3]="Untitled Document 1"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] argv[4]="1"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] argv[5]="job-uuid=urn:uuid:09c01fb4-c9b4-39de-6e4c-3d33a0710d25 job-originating-host-name=[v1.2002:4170:e35:1:81ae:ffff:ffff:83a7] date-time-at-creation= date-time-at-processing= time-at-creation=1476667545 time-at-processing=1476667545"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] argv[6]="/var/spool/cups/d00002-001"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[0]="CUPS_CACHEDIR=/var/cache/cups"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[1]="CUPS_DATADIR=/usr/share/cups"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[2]="CUPS_DOCROOT=/usr/share/cups/doc"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[3]="CUPS_FONTPATH=/usr/share/cups/fonts"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[4]="CUPS_REQUESTROOT=/var/spool/cups"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[5]="CUPS_SERVERBIN=/usr/lib/cups"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[6]="CUPS_SERVERROOT=/etc/cups"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[7]="CUPS_STATEDIR=/run/cups"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[8]="HOME=/var/spool/cups/tmp"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[9]="PATH=/usr/lib/cups/filter:/usr/bin:/usr/bin:/bin:/usr/bin"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[10]="SERVER_ADMIN=root@server.example.com"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[11]="SOFTWARE=CUPS/2.2.1"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[12]="TMPDIR=/var/spool/cups/tmp"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[13]="USER=root"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[14]="CUPS_MAX_MESSAGE=2047"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[15]="CUPS_SERVER=/run/cups/cups.sock"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[16]="CUPS_ENCRYPTION=IfRequested"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[17]="IPP_PORT=631"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[18]="CHARSET=utf-8"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[19]="LANG=en_US.UTF-8"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[20]="PPD=/etc/cups/ppd/HP_Officejet_2620_series.ppd"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[21]="RIP_MAX_CACHE=128m"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[22]="CONTENT_TYPE=application/vnd.cups-raster"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[23]="DEVICE_URI=hp:/usb/Officejet_2620_series?serial=CN4654G2BG0600"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[24]="PRINTER_INFO=HP Officejet 2620 series"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[25]="PRINTER_LOCATION=My desk"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[26]="PRINTER=HP_Officejet_2620_series"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[27]="PRINTER_STATE_REASONS=none"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[28]="CUPS_FILETYPE=document"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[29]="FINAL_CONTENT_TYPE=printer/HP_Officejet_2620_series"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] envp[30]="AUTH_I****"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] Started filter /usr/lib/cups/filter/hpcups (PID 5259)
 D [16/Oct/2016:21:25:45 -0400] [Job 2] Started backend /usr/lib/cups/backend/hp (PID 5260)
 D [16/Oct/2016:21:25:45 -0400] [Job 2] PID 5259 (/usr/lib/cups/filter/hpcups) stopped with status 1.
 D [16/Oct/2016:21:25:45 -0400] [Job 2] Hint: Try setting the LogLevel to "debug" to find out more.
 D [16/Oct/2016:21:25:45 -0400] [Job 2] PID 5260 (/usr/lib/cups/backend/hp) exited with no errors.
 D [16/Oct/2016:21:25:45 -0400] [Job 2] prnt/hpcups/HPCupsFilter.cpp 565: cupsRasterOpen failed, fd = 6
 D [16/Oct/2016:21:25:45 -0400] [Job 2] prnt/backend/hp.c 919: ERROR: null print job total=0
 D [16/Oct/2016:21:25:45 -0400] [Job 2] End of messages
 D [16/Oct/2016:21:25:45 -0400] [Job 2] printer-state=3(idle)
 D [16/Oct/2016:21:25:45 -0400] [Job 2] printer-state-message="Filter failed"
 D [16/Oct/2016:21:25:45 -0400] [Job 2] printer-state-reasons=none

The problem is that “Filter failed”, and my understanding of the root cause is that the printer is configured on the client with the specific make and model of the printer that is physically connected to the server. So, if the printer is, for example, a Hewlett-Packard printer, the client will render the print job and send the rendered job to the server. The server is expecting the job in some other format (PostScript? PDF? It is not important), and when it receives it already rendered for the exact printer make and model, the filter on the server fails to process the print job.

The solution is to configure the printer on the client as a “raw” printer, i.e. a printer where the printer driver is not specified. This way the client sends the job “unrendered” and lets the server do the rendering according to the correct printer driver that is installed (on the server).
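On the client, this boils down to creating the queue without a driver. A sketch using lpadmin (the queue name and server hostname here are made up; newer CUPS versions may warn that raw queues are deprecated):

```shell
shell$ sudo lpadmin -p myprinter -E \
         -v ipp://servername.example.com:631/printers/printer_name \
         -m raw
```

With no driver (PPD) on the client side, the job is passed through unfiltered and the server does all the rendering.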

I remember that when I first ran into this problem it was not easy to figure out what was wrong. I enabled all the debugging knobs that I could find and nothing helped. It was probably a post on some random blog or Internet forum that gave me a clue, but it was not easy to find.

And to add insult to injury, using the Printer control panel on Ubuntu to modify the default settings of a printer (on a CUPS client) would change the local printer configuration from “raw” to a specific make and model, which would then trigger the problem explained above. This made me scratch my head and waste hours trying to get a previously working printer back to a working state.

Ggggrrrrr.

References

This Red Hat bug report has good information on the issue and on how clients and servers should be configured:

https://bugzilla.redhat.com/show_bug.cgi?id=1010580

This ArchLinux forum discussion is very relevant to the problem:

https://bbs.archlinux.org/viewtopic.php?pid=1589908#p1589908

Finally, this ArchLinux wiki page:

https://wiki.archlinux.org/index.php/CUPS#Network_2

contains the following note, which describes precisely what the issue is:

“Warning: Avoid configuring both the server and the client with a printer filter – either the print queue on the client or the server should be ‘raw’. This avoids sending a print job through the filters for a printer twice, which can cause problems (for instance, [3]). See #Usage for an example of setting a print queue to ‘raw’.”

Reset Azure Virtual Network Gateway

No matter what I did I could not get an IPsec site-to-site tunnel going between an offsite test network and our Microsoft Azure virtual network. Our VPN gateway is a Cisco ASA 5506.

The issue was that the Cisco ASA would try to bring up the tunnel but some part of the negotiation would go wrong at some point. Debug messages on the Cisco ASA would show something like this:

Apr 19 09:21:41 [IKEv1 DEBUG]IP = 13.94.202.38, IKE SA Proposal # 1, Transform # 2 acceptable Matches global IKE entry # 5
 Apr 19 09:21:41 [IKEv1 DEBUG]IP = 13.94.202.38, constructing ISAKMP SA payload
 Apr 19 09:21:41 [IKEv1 DEBUG]IP = 13.94.202.38, constructing NAT-Traversal VID ver RFC payload
 Apr 19 09:21:41 [IKEv1 DEBUG]IP = 13.94.202.38, constructing Fragmentation VID + extended capabilities payload
 Apr 19 09:21:41 [IKEv1]IP = 13.94.202.38, IKE_DECODE SENDING Message (msgid=0) with payloads : HDR + SA (1) + VENDOR (13) + VENDOR (13) + NONE (0) total length : 128
 Apr 19 09:21:42 [IKEv1]IP = 13.94.202.38, Duplicate first packet detected. Ignoring packet.
 Apr 19 09:21:43 [IKEv1]IP = 13.94.202.38, Duplicate first packet detected. Ignoring packet.
 Apr 19 09:21:46 [IKEv1]IP = 13.94.202.38, Duplicate first packet detected. Ignoring packet.
 Apr 19 09:21:49 [IKEv1]IP = 13.94.202.38, IKE_DECODE RESENDING Message (msgid=0) with payloads : HDR + SA (1) + VENDOR (13) + VENDOR (13) + NONE (0) total length : 128
 Apr 19 09:21:57 [IKEv1]IP = 13.94.202.38, IKE_DECODE RESENDING Message (msgid=0) with payloads : HDR + SA (1) + VENDOR (13) + VENDOR (13) + NONE (0) total length : 128
 Apr 19 09:22:05 [IKEv1]IP = 13.94.202.38, IKE_DECODE RESENDING Message (msgid=0) with payloads : HDR + SA (1) + VENDOR (13) + VENDOR (13) + NONE (0) total length : 128
 Apr 19 09:22:13 [IKEv1 DEBUG]IP = 13.94.202.38, IKE MM Responder FSM error history (struct &0x00002aaac1c34cf0) , : MM_DONE, EV_ERROR-->MM_WAIT_MSG3,
 EV_TIMEOUT-->MM_WAIT_MSG3, NullEvent-->MM_SND_MSG2, EV_SND_MSG-->MM_SND_MSG2, EV_START_TMR-->MM_SND_MSG2, EV_RESEND_MSG-->MM_WAIT_MSG3, EV_TIMEOUT-->MM_WAIT_MSG3, NullEvent
 Apr 19 09:22:13 [IKEv1 DEBUG]IP = 13.94.202.38, IKE SA MM:4af079c0 terminating: flags 0x01000002, refcnt 0, tuncnt 0
 Apr 19 09:22:13 [IKEv1 DEBUG]IP = 13.94.202.38, sending delete/delete with reason message
 Apr 19 09:23:06 [IKEv1]IP = 13.94.202.38, IKE_DECODE RECEIVED Message (msgid=6a9f34a4) with payloads : HDR + HASH (8) + DELETE (12) + NONE (0) total length : 68
 Apr 19 09:23:06 [IKEv1 DEBUG]Group = 13.94.202.38, IP = 13.94.202.38, processing hash payload
 Apr 19 09:23:06 [IKEv1 DEBUG]Group = 13.94.202.38, IP = 13.94.202.38, processing delete
 Apr 19 09:23:06 [IKEv1]Group = 13.94.202.38, IP = 13.94.202.38, Connection terminated for peer 13.94.202.38. Reason: Peer Terminate Remote Proxy 10.100.152.0, Local Proxy 10.50.0.0
 Apr 19 09:23:06 [IKEv1 DEBUG]Group = 13.94.202.38, IP = 13.94.202.38, Active unit receives a delete event for remote peer 13.94.202.38.

A couple of key points in the above debug messages:

  1. “MM_WAIT_MSG3, EV_TIMEOUT” indicates that the Cisco ASA timed out waiting for the Azure VPN gateway.
  2. “Duplicate first packet detected. Ignoring packet” indicates that the Azure VPN gateway did not like the previous message sent by the Cisco ASA. Increasing the debug level (not shown above) reveals a cookie mismatch, and this is apparently what upsets the Azure Virtual Network Gateway.

This is shown in the Cisco ASA debug messages at a higher debug level:

Azure
 InitiatorCookie: 03 83 AD 7C 10 26 CB D6
 ResponderCookie: 14 42 19 27 F6 F2 DF 53

RECV PACKET from 13.91.5.150
 ISAKMP Header
 Initiator COOKIE: 03 83 ad 7c 10 26 cb d6
 Responder COOKIE: 00 00 00 00 00 00 00 00

These are debug messages produced on the Microsoft Azure side:

2016-03-02 10:31:37 ERROR user NULL 0000000FE1E59D80 0000000FE1E64320 f74513382e60832f cac68571e57c06d5 Invalid cookies. Try resetting SAs on-prem. IkeProcessPacketDispatch failed with HRESULT 0x80073616(ERROR_IPSEC_IKE_INVALID_COOKIE)

(Note the “ERROR_IPSEC_IKE_INVALID_COOKIE” error code.)

After spending some time troubleshooting, we could not find anything wrong with the Cisco ASA configuration.

In the end, in my desperation, I decided to reset the Azure Virtual
Network Gateway and that seems to have fixed the issue for good.

The process to reset an Azure Virtual Network Gateway is a bit tricky because there is no way to do that using the Azure Portal; it needs to be done using PowerShell instead.

This is what I did to reset the Azure Virtual Network Gateway using
PowerShell:

1. Install Azure PowerShell. I used the instructions here:

https://azure.microsoft.com/en-us/documentation/articles/powershell-install-configure/

In particular, I went with the leaner and perhaps more complicated
installation from the PowerShell Gallery (instead of installing from
WebPI).

2. After Azure PowerShell was installed, I opened a PowerShell command window and ran the following commands:

Login-AzureRmAccount
Select-AzureRmSubscription -SubscriptionName "<your subscription name>"
$vg = Get-AzureRmVirtualNetworkGateway -ResourceGroupName RG
Reset-AzureRmVirtualNetworkGateway -VirtualNetworkGateway $vg

Apparently, Azure Virtual Network Gateways are redundant Virtual Machines, so resetting one will cause the other to take over.

The other one could be reset by invoking “Reset-AzureRmVirtualNetworkGateway” a few minutes after the first gateway has been reset, but in my case the site-to-site VPN tunnel came up after resetting only one of the gateways.

Note that the above XXXXX-AzureRmXXXXX PowerShell cmdlets use the new
Azure Resource Manager deployment model. Similar commands would have to
be used if the classic deployment model is used instead.

This article:

https://azure.microsoft.com/en-us/documentation/articles/vpn-gateway-resetgw-classic/

is a good reference for how to reset Azure Virtual Network Gateways that have been deployed using the classic deployment model. Note that it says that the same cannot be done for the Resource Manager deployment model but I think the capability is there now (I used it) and it is just that the article has not been updated yet.

On a related note, I should mention that another way of dealing with this problem is by deploying a Cisco ASAv virtual appliance and using that to terminate the site-to-site IPsec tunnel instead of terminating it on the Microsoft-provided Azure Virtual Network Gateway. This of course would be more expensive, given that licenses for the Cisco ASAv would have to be purchased, plus it is another Virtual Machine that would have to be deployed (and paid for).

SSL Traffic Decryption

We recently had a need to inspect the contents of an HTTPS (SSL/TLS) connection. As we had never had the opportunity to set things up to facilitate decryption of an SSL/TLS connection, we had to do a little bit of research.

The way we approached this was by running the software that establishes the HTTPS connection we need to decrypt in a VirtualBox Virtual Machine (VM), and then running a Man-in-the-Middle (MitM) proxy on the VM host, which runs Ubuntu 15.04. The MitM proxy that we used was mitmproxy. There was no reason in particular for choosing mitmproxy other than it was the first solution that we tried, it was very well documented, and it worked on the first try. We are very impressed with this little piece of software: its design is well thought out, and the text-based user interface is very powerful.

This post documents the steps involved in setting things up for decryption of SSL sessions.

General

There are several possible network topologies to use. The one that we chose was one where the client machine and the proxy machine are on the same physical network. Because we are using VirtualBox, where the Virtual Machine is the client machine and the Virtual Machine host is a physical machine, we configured the network settings of the (client) Virtual Machine to use bridged networking. This is equivalent to having two different machines on the same physical network segment.

Note: Our proxy machine (not a Virtual Machine) only had a wi-fi network interface so the Virtual Machine, through bridged networking, was using this wi-fi network interface to reach the network.

Set Up Of Proxy Machine

Installation

Nothing much to do here, really, as there is an Ubuntu binary package for mitmproxy, so installation boils down to a simple “apt-get install mitmproxy”.

VirtualBox Settings

The network interface of the virtual machine must be configured in bridged mode. The VM host machine only needs one interface (for example, the wireless NIC “wlan0”). That interface will be used for both the VM host and the actual VM to have network connectivity. Make sure the VM NIC is configured to use the VM host NIC as the bridge interface.

Also, VirtualBox must be configured to allow promiscuous mode on the bridge interface. This is configured in the “Advanced” section of the network adapter properties (where the interface mode [bridged, NAT, etc.] is configured). “Allow VMs” for the “Promiscuous Mode” setting is appropriate.
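The same settings can also be applied from the command line with VBoxManage; a sketch (the VM name, adapter number, and bridge interface are hypothetical):

```shell
shell$ VBoxManage modifyvm "Client VM" --nic1 bridged --bridgeadapter1 wlan0
shell$ VBoxManage modifyvm "Client VM" --nicpromisc1 allow-vms
```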

Configuration

After installing the mitmproxy software, the following things must be done:

  • Enable IP forwarding, which is normally disabled by default:
shell$ sudo sh -c 'echo 1 > /proc/sys/net/ipv4/ip_forward'
  • Disable ICMP redirects:
shell$ sudo sh -c 'echo 0 > /proc/sys/net/ipv4/conf/all/send_redirects'
shell$ sudo sh -c 'echo 0 > /proc/sys/net/ipv4/conf/wlan0/send_redirects'
  • Add iptables rules to redirect traffic destined to TCP ports 80 and 443 to port 8080, which is the port mitmproxy listens on:
shell$ sudo iptables -t nat -A PREROUTING -i wlan0 -p tcp --dport 80 -j REDIRECT --to-port 8080
shell$ sudo iptables -t nat -A PREROUTING -i wlan0 -p tcp --dport 443 -j REDIRECT --to-port 8080
  • Run mitmproxy:
shell$ mitmproxy -T --host

Setup Of Client Machine

The following things need to be configured on the client machine:

  • Configure the machine running mitmproxy (the proxy machine) as the default gateway. This will cause all SSL/TLS traffic going towards the server to be sent through the proxy machine, assuming that the server is on a different (remote) subnet.
  • Install the Certificate Authority certificate that the proxy machine will present to the client when the client establishes SSL/TLS sessions. mitmproxy really shines in this area, making the certificate installation a very seamless process. We will not repeat here the excellent documentation on how to do this. Instead, we will point readers to the documentation: http://mitmproxy.org/doc/certinstall.html.
  • To make the proxy machine the default gateway, the easiest approach is to configure the client machine’s TCP/IP settings manually. If DHCP is used for IP configuration, the default gateway will be whatever the DHCP server sends, which might be different from the IP address of the proxy machine. In that case, the client machine can be forced to use the proxy machine as its default gateway by adding a new default route with a lower metric, for example: ip route add default via <IP address of proxy machine> metric 50.

Decrypting The SSL/TLS Session

Once the machine running the mitmproxy software (the “proxy” machine) and the machine running the SSL/TLS client (the “client” machine) are configured, we are ready to establish the SSL/TLS sessions that we want to decrypt — just open your browser and go to the https:// URL you are interested in examining, launch your SSL VPN client, etc.

The proxy machine will intercept the connection and do what it does well, i.e. pretend to the server to be the client, and pretend to the client to be the server, while decrypting traffic going in both directions.

mitmproxy provides a fantastic text-based user interface that allows the user to easily navigate through each SSL/TLS request and response going through the proxy.

The following screenshot (click on image for larger version) shows the main mitmproxy window, which lists all the captured flows:

mitmproxy1

The following screenshot (click on image for larger version) shows a particular flow, specifically the request part of the flow:

mitmproxy2

And finally, this screenshot (click on image for larger version) shows the server’s response to the previous request:

mitmproxy3

And that is it; there really isn’t anything to it. It took longer to read the mitmproxy documentation than to set things up and run the SSL/TLS session.

Final Thoughts

From the main mitmproxy window, all flows can be saved to a file for later analysis by pressing the ‘w’ (write) key, which will prompt whether to save all flows or just the one under the cursor, and for the name of the file to save them to.

Flows can be loaded later by running mitmproxy with the -r (read) switch.
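For example, re-opening a saved capture for offline inspection (the file name here is made up):

```shell
shell$ mitmproxy -r flows.saved
```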

Caveats

Be aware of mitmproxy bug 659 (https://github.com/mitmproxy/mitmproxy/issues/659). This bug causes HTTP HEAD requests to return a Content-Length equal to zero instead of the correct value. This will cause some applications to fail as they will think there is nothing to download. This tripped me up pretty badly until I found the previously mentioned bug report and applied the fix that was committed to resolve it.

 

FreeRADIUS: Active Directory authentication and group check via winbind + rlm_unix, not LDAP

[This blog post is based on an email that I sent to the freeradius-users mailing list in September 2014.]

I have a pretty common requirement: authenticate wireless users against Active Directory and prevent SSID cross-connections, i.e. users in Active Directory group A can only connect to SSID A and users in Active Directory group B can only connect to SSID B.

I have seen plenty of messages in the freeradius-users mailing list archives about how to accomplish this. The authentication part is easy and has excellent documentation, e.g.

http://deployingradius.com/documents/configuration/active_directory.html

The group checking part is also well understood and documented (rlm_ldap), and whoever asks apparently gets told to use LDAP. At least I could not find anything on the alternate approach described here.

I decided to look into a different approach, one that does not involve LDAP. Because the machine already has winbind running (for ntlm_auth), why not use the Name Service Switch and rlm_unix to check for group membership and avoid using LDAP altogether? Group membership information is already available if one has added “winbind” to the “passwd” and “group” entries in /etc/nsswitch.conf.
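The relevant /etc/nsswitch.conf entries look something like this (a sketch; the other sources on each line will vary by distribution):

```
# /etc/nsswitch.conf
passwd:  compat winbind
group:   compat winbind
```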

I apparently got this to work and wanted to share the solution in case someone finds it useful:

First, I configured winbind for ntlm_auth use by FreeRADIUS, as explained elsewhere. Then, I verified that the system can see Active Directory users and groups as if they were Unix users and groups:

shell$ id DOMAIN\\username
uid=10017(DOMAIN\username) gid=10002(DOMAIN\domain users) groups=10002(DOMAIN\domain users),10024(DOMAIN\computerlab-monitoring),10008(DOMAIN\domain admins),10034(DOMAIN\vdi teachers),10010(DOMAIN\teachers),10026(DOMAIN\vdi users),10011(DOMAIN\teacher assistants),10074(DOMAIN\schema admins)

Then, I put this logic, which is similar to what one would normally use if LDAP and Ldap-Groups were in use, in the post-authentication section of my sites-enabled/default:

if (NAS-Port-Type == Wireless-802.11) {
  if (Called-Station-Id =~ /.*:SSID-A/i) {
    # Can't do 'if (Group != "xxxxx")' because !=
    # operator doesn't work for group checking. Careful
    # with the number of backslashes.
    if (!(Group == "DOMAIN\\\\group A") ) {
      update reply {
        Reply-Message = "User not allowed to join this wireless network"
      }
      reject
    }
  }
  elsif (Called-Station-Id =~ /.*:SSID-B/i) {
    if (!(Group == "DOMAIN\\\\group B") ) {
      update reply {
        Reply-Message = "User not allowed to join this wireless network"
      }
      reject
    }
  }
}
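As a sanity check of the regular expressions above: wireless NASes typically send Called-Station-Id as “<AP MAC>:<SSID>”, so the pattern keys off the SSID suffix. A quick illustration with grep (the MAC address and SSID names are made up):

```shell
# Hypothetical Called-Station-Id value for a client joining SSID-A
csid='02-00-00-AA-BB-CC:SSID-A'

# Same pattern as the unlang check above, case-insensitive
if printf '%s\n' "$csid" | grep -Eiq '.*:SSID-A'; then
  echo 'allowed on SSID A'
fi

# A client on another SSID does not match and would be rejected
printf '%s\n' '02-00-00-AA-BB-CC:SSID-B' | grep -Eiq '.*:SSID-A' || echo 'would be rejected'
```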

This works if the EAP identity is “DOMAIN\username”. However, I did not want to make things unnecessarily complicated for my users and wanted them to be able to enter just “username” when they configure their devices.

The main issue to address, however, is that if the identity is entered as “username” (not “DOMAIN\username”), the group check will fail because the Unix user ID for users that are known to the system via winbind is “DOMAIN\username”, not “username” (see output from the “id” command above).

“Not a problem”, I thought, “I’ll just manipulate User-Name before anything happens and prefix it with “DOMAIN\”.” It turns out that was a bad idea, because that made User-Name different from the EAP identity hidden in the EAP message, which caused the “rlm_eap: identity does not match User-Name, setting from EAP identity” message that has bitten so many people before.

The solution I came up with was to still manipulate the User-Name but towards the end, in the post-auth section, instead of at the beginning.  This way EAP uses the right identity, but the rlm_unix Group check uses the correct “DOMAIN\username” User-Name.

The final configuration looks like this:

if (NAS-Port-Type == Wireless-802.11) {
  # If User-Name doesn't contain our domain then add it.
  # It's needed for the Group check to use the correct
  # username.
  if (User-Name !~ /DOMAIN\\\\/i) {
    update request {
      User-Name := "DOMAIN\\\\\\\\%{User-Name}"
    }
  }

  if (Called-Station-Id =~ /.*:SSID-A/i) {
    # Can't do 'if (Group != "xxxxx")' because !=
    # operator doesn't work for group checking. Careful
    # with the number of backslashes.
    if (!(Group == "DOMAIN\\\\group A") ) {
      update reply {
        Reply-Message = "User not allowed to join this wireless network"
      }
      reject
    }
  }
  elsif (Called-Station-Id =~ /.*:SSID-B/i) {
    if (!(Group == "DOMAIN\\\\group B") ) {
      update reply {
        Reply-Message = "User not allowed to join this wireless network"
      }
      reject
    }
  }
}

I do not know about the performance impact but this should not require additional network traffic to check for group membership because winbind caches this information.

Another possible advantage is redundancy: Using winbindd (I theorize, I am not sure about this) provides redundancy because the group membership comes from the domain controller, which is found using DNS lookups — if a controller goes down then another (hopefully) takes its place and winbindd will be able to find it with no configuration changes.

This approach seemed simple to me — no additional configuration other than manipulating the User-Name in the post-authentication phase.

Things can also be made to work if the user chooses to configure the supplicant with “DOMAIN\user” as the identity — in this case one needs to configure “with_ntdomain_hack = yes” in modules/mschap, create an empty “DOMAIN” realm in proxy.conf, and enable the ntdomain realm in the authorize section of sites-enabled/inner-tunnel. Doing this will allow both “DOMAIN\user” and just “user” to work.
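A sketch of those three pieces, in FreeRADIUS 2.x-style configuration (fragments only; file layouts differ between versions):

```
# modules/mschap
mschap {
    with_ntdomain_hack = yes
}

# proxy.conf: empty realm so "DOMAIN\user" identities are recognized
realm DOMAIN {
}

# sites-enabled/inner-tunnel, in the authorize section: enable the
# ntdomain realm (defined in modules/realm)
authorize {
    ntdomain
}
```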

I have this being used in production on a small network (less than 50 users) and have not encountered any show stoppers yet. If you read the freeradius-users thread referenced at the top of this post you will notice that modifying User-Name is discouraged. I was told to use Stripped-User-Name instead but I never had a chance to go back and look into it.  Caveat emptor.

Update May 22, 2016: I had to set up another Linux server for winbindd. I ran into a few minor problems that took me some time to troubleshoot and fix. First, it seems like the Ubuntu 14.04 upstart script configures the wrong winbindd privileged directory — instead of /var/lib/samba/winbindd_privileged/ it configures /var/run/samba/winbindd_privileged/. The fix is to edit the upstart job configuration file (/etc/init/winbind.conf) and change it to use the correct directory path. Second, “wbinfo -a user%password” will not work when testing things (as suggested at the deployingradius.com page above) unless the user running wbinfo is a member of the group “winbindd_priv”. And third, running “radtest -t mschap paris mG2eudPas 127.0.0.1:18120 0 testing123”, also suggested at the deployingradius.com page above, will not work unless the user the FreeRADIUS daemon runs as is a member of the group “winbindd_priv” — this is also required for correct FreeRADIUS operation. So, these are minor issues, but they can be a hassle to troubleshoot.

Cisco ASA, “show service-policy”, and SNMP

Recently, a co-worker asked if it is possible to obtain the counters shown in the output from the command “show service-policy” on a Cisco ASA. I did not know the answer so I had to do a little bit of digging…

The list of MIBs supported by the Cisco ASA is documented here.

Based on a quick reading of that document, it seemed like CISCO-UNIFIED-FIREWALL-MIB could have provided this information, *if* it had been completely implemented. However, there is a documented caveat for CISCO-UNIFIED-FIREWALL-MIB at the above page:

“Limited support for objects under cuFwConnectionGrp and cuFwUrlFilterGrp.”

And an snmpwalk confirmed that the information is not there:

paris@bethlehem[1]:~$ snmpwalk -m CISCO-UNIFIED-FIREWALL-MIB -Os -v2c -c ****** 1.2.3.4 ciscoUnifiedFirewallMIB
cufwConnGlobalNumResDeclined.0 = Counter64: 0 Connections
cufwConnGlobalNumActive.0 = Gauge32: 168 Connections
cufwConnGlobalConnSetupRate1.0 = Gauge32: 2 Connections per second
cufwConnGlobalConnSetupRate5.0 = Gauge32: 0 Connections per second
cufwConnSetupRate1.udp = Gauge32: 1 Connections Per Second
cufwConnSetupRate1.tcp = Gauge32: 0 Connections Per Second
cufwConnSetupRate5.udp = Gauge32: 0 Connections Per Second
cufwConnSetupRate5.tcp = Gauge32: 0 Connections Per Second
cufwUrlfRequestsNumProcessed.0 = Counter64: 0 Requests
cufwUrlfRequestsProcRate1.0 = Gauge32: 0 Requests per second
cufwUrlfRequestsProcRate5.0 = Gauge32: 0 Requests per second
cufwUrlfRequestsNumAllowed.0 = Counter64: 0 Requests
cufwUrlfRequestsNumDenied.0 = Counter64: 0 Requests
cufwUrlfRequestsDeniedRate1.0 = Gauge32: 0 Requests per second
cufwUrlfRequestsDeniedRate5.0 = Gauge32: 0 Requests Per Second
cufwUrlfRequestsNumCacheAllowed.0 = Counter64: 0 Requests
cufwUrlfRequestsNumCacheDenied.0 = Counter64: 0 Requests
cufwUrlfRequestsNumResDropped.0 = Counter64: 0 Requests
cufwUrlfRequestsResDropRate1.0 = Gauge32: 0 Requests Per Second
cufwUrlfRequestsResDropRate5.0 = Gauge32: 0 Requests Per Second
cufwUrlfNumServerTimeouts.0 = Counter64: 0
cufwUrlfNumServerRetries.0 = Counter64: 0
paris@bethlehem[1]:~$

That does not mean that another MIB cannot provide the information we are looking for. However, the “sh snmp-server oidlist” command doesn’t show any promising OIDs so it seems like we are out of luck.

Useful References

https://supportforums.cisco.com/document/7336/snmp-mibs-and-traps-asa-additional-information