USACO 2022 US Open Contest, Silver Problem 2. Subset Equality

USACO 2022 US Open Contest, Silver Problem 2. Subset Equality

The cows are trying out a new method of exchanging coded messages with each-other where they mix irrelevant letters in among relevant letters to make the messages hard to decode.

The cows transmit two strings ss and tt each of length at most 105105 consisting only of the lowercase English letters 'a' through 'r'. To try and make sense of this coded message, you will be given QQ queries (1Q1051≤Q≤105). Each query provides a subset of the lowercase English letters from 'a' to 'r.' You need to determine for each query whether ss and tt, when restricted only to the letters in the query, are equal.

INPUT FORMAT (input arrives from the terminal / stdin):

First line contains ss.Second line contains tt.

Third line contains QQ.

Next QQ lines each contain a query string. Within a query string, no letters are repeated. Furthermore, all query strings are in sorted order, and no query string appears more than once.

OUTPUT FORMAT (print output to the terminal / stdout):

For each query, print 'Y' if ss and tt, when restricted only to the letters in the query, are equal, or 'N' otherwise.

SAMPLE INPUT:

aabcd
caabd
4
a
ac
abd
abcd


SAMPLE OUTPUT:

YNYN


For the first query, both strings become "aa" when restricted only to 'a.'

For the second query, the first string becomes "aac" while the second string becomes "caa."

SCORING:

• Test case 2 satisfies |s|,|t|,Q1000|s|,|t|,Q≤1000.
• Test cases 3-11 satisfy no additional constraints.

Problem credits: Danny Mittal

USACO 2022 US Open Contest, Silver Problem 2. Subset Equality 题解(翰林国际教育提供，仅供参考)

[/hide]

(Analysis by Danny Mittal)

A starting point for this problem is to notice that if ss and tt are equal when restricted to some subset XX of letters, then they must also be equal when restricted to any subset YY of XX. In particular, this means that if ss and tt are equal when restricted to some subset XX of letters, they are also equal when restricted to any two letters in XX.

Now consider the case where ss and tt are not equal when restricted to XX. Again let ss′ and tt′ be ss and tt restricted to XX. One possibility is that ss′ and tt′ are of different lengths, which means that ss and tt have differing amounts of the letters in XX.

The other possibility is that ss′ and tt′ have different letters at some position. In this case, consider the first position at which ss′ and tt′ differ, and let xx and yy be the two letters that ss′ and tt′ have at this position. Since ss′ and tt′ are completely the same up to this position, if we remove all letters other than xx and yy from ss′ and tt′ to create new strings s′′s″ and t′′t″, the portion before this position will become the same (smaller) portion at the beginning of s′′s″ and t′′t″, and so the xx and yy at the same position in ss′ and tt′ will still be in the same position in s′′s″ and t′′t″. This means that s′′s″ and t′′t″ are not the same, and so, since s′′s″ and t′′t″ are actually just ss and tt restricted to the two letters xx and yy, we can conclude that ss and tt are not the same when restricted to xx and yy.

We have therefore shown that if ss and tt are not equal when restricted to XX, then either they do not have the same amount of letters in XX, or they are not the same when restricted to some pair of letters in XX. However, we already know that if ss and tt are equal when restricted to XX, then they must be the same when restricted to any pair of letters in XX, and in this case they clearly must also have the same amount of letters in XX.

This means that ss and tt being equal when restricted to XX is actually equivalent to having the same amount of letters in XX and being equal when restricted to any pair of letters in XX.

We can use this fact to write our algorithm. We need to be able to quickly compute, for a given subset of letters XX, whether ss and tt have the same amount of letters in XX, and whether ss and tt are the same when restricted to any pair of letters in XX.

We can make checking the first condition easy by precomputing the frequencies of each letter in each of ss and tt. This takes linear time. Then, for a given XX, we simply add up the frequencies of all letters in XX for each of ss and tt and check if the sums are equal.

We can make checking the second condition easy by just precomputing the answer for each pair of letters. For each pair, we can find the answer in linear time by simply reducing ss and tt to the strings ss′ and tt′ that only have the letters in the pair, then checking whether ss′ and tt′ are equal. Then, for a given XX, we simply loop through all pairs of letters in XX and check our precomputed answers.

The overall runtime becomes O(number of pairs(|s|+|t|+Q))O(number of pairs⋅(|s|+|t|+Q)) due to the linear time precomputation for each pair and then checking each pair in constant time for each query. Since there are only 1818 letters we need to consider, there are only 18172=15318⋅172=153 pairs, which is small enough for this algorithm to be reasonably efficient.

Java code:

import java.io.BufferedReader;
import java.io.IOException;

public class SubsetEquality {

public static void main(String[] args) throws IOException {
int[] freqsS = new int[26];
int[] freqsT = new int[26];
for (char x = 'a'; x <= 'z'; x++) {
for (char letter : s) {
if (letter == x) {
freqsS[x - 'a']++;
}
}
for (char letter : t) {
if (letter == x) {
freqsT[x - 'a']++;
}
}
}
boolean[][] compatible = new boolean[26][26];
for (char y = 'a'; y <= 'z'; y++) {
for (char x = 'a'; x < y; x++) {
StringBuilder sRestricted = new StringBuilder();
StringBuilder tRestricted = new StringBuilder();
for (char letter : s) {
if (letter == x || letter == y) {
sRestricted.append(letter);
}
}
for (char letter : t) {
if (letter == x || letter == y) {
tRestricted.append(letter);
}
}
compatible[x - 'a'][y - 'a'] = sRestricted.toString().equals(tRestricted.toString());
}
}
StringBuilder out = new StringBuilder();
for (int q = Integer.parseInt(in.readLine()); q > 0; q--) {
int sSum = 0;
int tSum = 0;
for (char x : subset) {
sSum += freqsS[x - 'a'];
tSum += freqsT[x - 'a'];
}
if (sSum != tSum) {
}
for (int j = 0; j < subset.length; j++) {
for (int k = j + 1; k < subset.length; k++) {
if (!compatible[subset[j] - 'a'][subset[k] - 'a']) {
}