Level 2. [1차] 뉴스 클러스터링

Level 2. [1차] 뉴스 클러스터링

문제의 설명 그대로 푼 풀이이다.

모든 영문자를 대문자로 바꾸고, 영문자만 2글자씩 쪼개고, 정렬한 후 set_union(), set_intersection() STL 함수를 이용해 합집합과 교집합의 크기를 구해 해결했다.

#include <iostream>
#include <vector>
#include <cctype>
#include <algorithm>

using namespace std;

int solution(string str1, string str2)
{
    int answer = 0;
    vector<string> setA, setB;
    vector<string> setUnion, setInter;

    transform(str1.begin(), str1.end(), str1.begin(), ::toupper);
    transform(str2.begin(), str2.end(), str2.begin(), ::toupper);

    for (int i = 0; i < str1.length() - 1; ++i)
        if (isalpha(str1[i]) && isalpha(str1[i + 1]))
            setA.emplace_back(str1.substr(i, 2));
    
    for (int i = 0; i < str2.length() - 1; ++i)
        if (isalpha(str2[i]) && isalpha(str2[i + 1]))
            setB.emplace_back(str2.substr(i, 2));

    sort(setA.begin(), setA.end());
    sort(setB.begin(), setB.end());

    set_union(setA.begin(), setA.end(), setB.begin(), setB.end(), back_inserter(setUnion));
    set_intersection(setA.begin(), setA.end(), setB.begin(), setB.end(), back_inserter(setInter));
            
    return setUnion.size() ? (1.f * setInter.size() / setUnion.size() * 65536) : 65536;
}

STL 함수를 사용하지 않고 합집합과 교집합의 크기를 구하는 방법이다.

모든 경우의 수만큼 배열 크기를 만들고 각 경우의 등장 횟수를 저장한다.

두 집합(배열)의 원소들을 비교해 등장 횟수가 많은 수를 합집합의 크기에 더하고, 작은 수를 교집합의 크기에 더하면 된다.

{AA, AA, AA, BB} ∪ {AA, AA} = {AA, AA, AA, BB}
{AB, AB, AC} ∩ {AB} = {AB}

#include <iostream>
#include <vector>
#include <cctype>
#include <algorithm>

using namespace std;

int solution(string str1, string str2)
{
    int answer = 0;
    int nUnion = 0, nInter = 0;
    vector<int> setA(676, 0), setB(676, 0);

    transform(str1.begin(), str1.end(), str1.begin(), ::toupper);
    transform(str2.begin(), str2.end(), str2.begin(), ::toupper);

    for (int i = 0; i < str1.length() - 1; ++i)
        if (isalpha(str1[i]) && isalpha(str1[i + 1]))
            ++setA[(str1[i] - 'A') * 26 + (str1[i + 1] - 'A')];

    for (int i = 0; i < str2.length() - 1; ++i)
        if (isalpha(str2[i]) && isalpha(str2[i + 1]))
            ++setB[(str2[i] - 'A') * 26 + (str2[i + 1] - 'A')];

    for (int i = 0; i < 676; ++i)
        nUnion += max(setA[i], setB[i]),
        nInter += min(setA[i], setB[i]);

    return nUnion ? (1.f * nInter / nUnion * 65536) : 65536;
}

영문자를 대문자 혹은 소문자로 바꿔 'h' - 'a' 방식으로 빼지 않고, 비트 연산을 이용한 풀이다.

아스키 코드에서 대문자와 소문자는 정확히 32(100000)만큼 차이가 나기 때문에 'G' & 31과 'g' & 31은 같다.

'G'(1000111) & 31(11111) = 00111
'g'(1100111) & 31(11111) = 00111
'A'가 1000001, 'Z'가 1011010이기 때문에 31(11111)로 마스킹해 준 것이다.

다만, 이때 'Z' & 31 * 26 + 'Z' & 31 = 26 * 26 + 26 = 702가 되기 때문에 배열의 범위를 늘려주어야 한다.

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int solution(string str1, string str2)
{
    int answer = 0;
    short nUnion = 0, nInter = 0;
    short setA[703] = {0}, setB[703] = {0};

    for (int i = 1; i < str1.length(); ++i)
        if (isalpha(str1[i - 1]) && isalpha(str1[i]))
            ++setA[(str1[i - 1] & 31) * 26 + (str1[i] & 31)];

    for (int i = 1; i < str2.length(); ++i)
        if (isalpha(str2[i - 1]) && isalpha(str2[i]))
            ++setB[(str2[i - 1] & 31) * 26 + (str2[i] & 31)];

    for (int i = 0; i < 703; ++i)
        nUnion += max(setA[i], setB[i]),
        nInter += min(setA[i], setB[i]);

    return nUnion ? (1.f * nInter / nUnion * 65536) : 65536;
}

저작자표시 비영리 변경금지

'자료구조 & 알고리즘 > 프로그래머스' 카테고리의 다른 글

Level 0. 배열 만들기 4 (0)	2023.05.13
Level 0. 배열 만들기 2 (0)	2023.05.12
Level 0. 커피 심부름 (0)	2023.05.10
Level 0. 코드 처리하기 (0)	2023.05.09
Level 2. [1차] 프렌즈4블록 (0)	2023.05.08

'자료구조 & 알고리즘 > 프로그래머스' 카테고리의 다른 글

검색 태그

티스토리툴바